ABSTRACT

We have developed an end-to-end photometric data processing pipeline to compare current photometric
algorithms commonly used on ground-based imaging data. This testbed is exceedingly adaptable,
and enables us to perform many research and development tasks, including image subtraction and
co-addition, object detection and measurements, the production of photometric catalogs, and the cre- 
ation and stocking of database tables with time-series information. This testing has been undertaken 
to evaluate existing photometry algorithms for consideration by a next-generation image processing 
pipeline for the Large Synoptic Survey Telescope (LSST). We outline the results of our tests for four 
packages: The Sloan Digital Sky Survey's (SDSS) Photo package, DAOPhot and allframe, DoPhot, 
and two versions of Source Extractor (SExtractor). The ability of these algorithms to perform point- 
source photometry, astrometry, shape measurements, star-galaxy separation, and to measure objects 
at low signal-to-noise is quantified. We also perform a detailed crowded field comparison of DAOPhot 
and allframe, and profile the speed and memory requirements in detail for SExtractor. We find 
that both DAOPhot and Photo are able to perform aperture photometry to high enough precision to
meet LSST's science requirements, but perform less adequately at PSF-fitting photometry. Photo performs
the best at simultaneous point and extended-source shape and brightness measurements. SExtractor 
is the fastest algorithm, and recent upgrades in the software yield high-quality centroid and shape 
measurements with little bias towards faint magnitudes. Allframe yields the best photometric results 
in crowded fields. 

Subject headings: Data Analysis and Techniques 



1. INTRODUCTION 

The next generation of astronomical surveys will provide
data rates and volumes that dwarf those of current
time-domain surveys (e.g. Tyson 2006; Kaiser 2006),
requiring commensurate advances in astronomical
image processing and data management capabilities.
These surveys will enable synoptic study of
such diverse science aspects as the minor planets of
the solar system (Jones et al. 2006), Galactic structure
through color-magnitude (Juric et al.) and proper
motion (Munn et al. 2004) studies, time domain variability
(Becker et al. 2004), and the study of cosmological
dark matter and dark energy using type Ia supernovae
(Wood-Vasey et al. 2007), baryon acoustic oscillations
(Eisenstein et al. 2005), galaxy clustering (Bahcall et al.
2004), and weak lensing (Zhan 2006). These science goals
require precision astrometric and photometric measure- 
ments of both stars and galaxies. The engineering chal- 
lenge in these surveys is to design and manufacture a 
system able to obtain data of requisite quality. The data 
management challenge is to reliably and rapidly trans- 
fer, analyze, and store the raw data and data products, 
with the algorithmic engineering challenge to realize the 
science goals through precision analysis of the data. 

The Science Requirements Document (SRD) for the 
Large Synoptic Survey Telescope (LSST) includes constraints
on point-source photometry and astrometry, as
well as on stellar and galaxy shape measurements. These
requirements are not to be violated in data or in soft- 
ware. The goal of this research is to test the latter, given 
a large set of input data. In particular, the LSST SRD 
requires that the root-mean-square (RMS) of the unre- 
solved source magnitude distribution around the mean 
value is not to exceed 0.005 magnitudes in the g, r,
and i passbands, when supported by photon statistics. 
The measured photometric errors shall not exceed the 
quoted photometric errors by 10%. The RMS of the dis- 
tance distribution for stellar pairs with separations of 
5, 20, and 200' shall not exceed 10, 10, and 15 milli- 
arcseconds in the g, r and i-bands, respectively. Finally, 
for fields within 10 degrees of zenith, the r and i-band 
point-source ellipticity distribution will have a median 
value of no more than 0.04, and must be correctable to 
a distribution with a median no larger than 0.002. 
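
As an illustration of how the photometric requirements quoted above might be checked against a set of repeat measurements, the following is a minimal sketch. The function names, the dictionary keys, and the use of a single quoted error per object are assumptions made for this example; only the numerical limits come from the text.

```python
import math

# Limits quoted in the text above; the dictionary keys are hypothetical names.
SRD_LIMITS = {
    "phot_rms_mag": 0.005,   # RMS of unresolved-source magnitudes (g, r, i)
    "error_excess": 0.10,    # measured errors may exceed quoted ones by 10%
}

def rms_about_mean(values):
    """Root-mean-square scatter of the values around their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def check_photometry(repeat_mags, quoted_error):
    """Compare the scatter of repeat measurements with the two limits."""
    rms = rms_about_mean(repeat_mags)
    return {
        "rms": rms,
        "meets_rms_limit": rms <= SRD_LIMITS["phot_rms_mag"],
        "meets_error_limit": rms <= (1.0 + SRD_LIMITS["error_excess"]) * quoted_error,
    }
```

The sketch treats the RMS of repeat magnitudes as the "measured" error, which is one possible reading of the requirement rather than the SRD's formal definition.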
We compare here extant software packages in
the context of these LSST science requirements.
This includes DAOPhot (Stetson 1987), DoPhot
(Schechter et al. 1993), allframe (Stetson 1994),
SExtractor (Bertin & Arnouts 1996), and Photo
(Lupton et al.). We have established quality assessment
metrics for comparing ensemble measurements
of stellar positions, shapes, and brightnesses. Important 
algorithmic steps required to achieve this are the 
separation of stars and galaxies, and the deblending of 
neighboring objects. Because the absolute "truth" is not 
known here, these comparisons are by necessity relative. 
We compare the times required to reduce astronomical 
images, as well as memory consumption, when possible. 

While we have attempted to tune each package to obtain
the best results for the ensemble of data, it is very
likely that better results would emerge through individual
study of each image. As such, this analysis reflects
the results for a typical pipelined application of each
package.

Becker et al.

We summarize the requirements for characterizing stellar
and extended sources in astronomical images in Section 2.
We describe the data used in the analysis in
Section 3, our pipeline infrastructure in Section 4, and
summarize the algorithms we tested in Section 5. Our
time-series database is outlined in Section 6, and the algorithms
used to "cluster" single detections into multiple
measurements of astronomical objects are described in
Section 7. We discuss the methods used to select objects
from our database in Section 8. We describe the results of
our analyses regarding star/galaxy separation, photometry,
shape measurements, centroiding, and photometric
depth in Sections 9-13. We focus on a crowded-field
analysis of globular cluster M2 in Section 14, and on
algorithm timing and scaling tests in Section 15. We
conclude with an overall summary in Section 16.

2. SOURCE MEASUREMENTS IN ASTRONOMY 

The problem of point source photometry is a well-studied
one, with various solutions whose algorithms differ
in their methods and implementation (e.g. Howell
1989; Thomson et al. 1992; Handler 2003; Ivezic et al.
2004; Pinheiro da Silva et al. 2006). The problem requires
the correct modeling of an image's point spread
function (PSF), the transfer function of point sources
through the atmosphere and the optics of the telescope.
This solution typically includes an analytic model and an
"aperture correction" that compensates for the limitations
of the model (e.g. Tanvir et al. 1995; Handler 2003;
Kuijken 2006). In practice, the aperture and PSF fluxes
are determined in a small aperture that is a small mul- 
tiple of the PSF full-width at half maximum (FWHM). 
The aperture flux is an unweighted measurement, while 
the PSF flux is derived using the PSF as the weight. 
The aperture fluxes of bright stars are next measured 
out to a very large radius, where one is reasonably cer- 
tain that all the light has been collected. The ratio of the 
bright star flux in the large and small apertures yields a 
multiplicative flux correction to the small aperture mea- 
surements. In general, these aperture corrections need 
to vary across an astronomical image because of spatial 
variation in the PSF. For very bright stars, aperture pho- 
tometry yields a more accurate measurement of the flux 
than PSF photometry, due to limitations of the analytic 
model. However, for faint stars near the sky limit, PSF 
photometry yields a more precise measurement of the 
flux, since aperture photometry includes many contribu- 
tions from sky pixels. 
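
The aperture correction described above can be sketched as follows. This is a minimal illustration, assuming a single correction for the whole image (no spatial variation) and taking the median ratio over the bright stars; both choices, and the function names, are assumptions of this example rather than the method of any package tested here.

```python
from statistics import median

def aperture_correction(small_fluxes, large_fluxes):
    """Median ratio of large- to small-aperture flux over bright stars."""
    ratios = [large / small
              for small, large in zip(small_fluxes, large_fluxes)
              if small > 0]
    if not ratios:
        raise ValueError("need at least one bright star with positive flux")
    return median(ratios)

def corrected_flux(small_flux, correction):
    """Apply the multiplicative correction to a small-aperture measurement."""
    return small_flux * correction
```

A spatially varying correction would replace the single median with a low-order fit of the ratio against image position.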

Galaxy photometry is a much less studied issue, with a 
variety of pitfalls. Because of color changes in a galaxy's 
light profile, the correct aperture to use before becoming 
sky-noise dominated is a function of the passband one 
is observing in. Galaxies are also irregular in shape and
may be deblended non-uniquely (Kushner et al. 2006).
Typically, a basic symmetric model (de Vaucouleurs, exponential)
is fitted to the light profile. For weak lensing
science, which requires precision measurement of the
shapes of galaxies (e.g. Bernstein & Jarvis 2002), adaptive
second moments of the light profile are used to quantify
the ellipticity of galaxies. Photometric redshift measurements
require the consistent accounting of flux in a
variety of passbands, and thus ideally require a simultaneous
ensemble measurement of images taken through
different filters (Collister et al. 2007).

3. THE DATA 

One of the algorithms under study is the photometric 
reduction pipeline used by the Sloan Digital Sky Survey 
(SDSS) : Photo. Photo is one of the few packages, and 
the only one analyzed here, that consistently performs 
both stellar PSF and extended source photometry, and 
represents a solid precursor pipeline for future surveys. 
However, Photo has been designed to operate solely on 
data from SDSS; testing of this algorithm requires that 
we operate on data from SDSS. 

SDSS uses a dedicated 2.5m telescope (Gunn et al.
2006) to provide simultaneous 5-band imaging
(u, g, r, i, z; Fukugita et al. 1996). The imaging camera
contains 30 photometric CCDs arranged in 6 columns
(Gunn et al. 1998). The images are obtained in drift-scan
mode, and "fields" are defined corresponding to a
scan length of 9' (36 seconds of drift-scanning), with a
field width of 14'. The five images corresponding to a
given field, obtained in the order r-i-u-z-g, are
simultaneously processed by Photo.

We have chosen to use data from two photometric 
runs of SDSS equatorial Strip 82N for these compar- 
isons. These are runs 3437 (obtained MJD 52578) and 
4207 (MJD 52936). The data for run 3437 extend from 
311 deg < RA < 23 deg (J2000), with median g, r, and
i-band PSF FWHMs of 1.3", 1.1", and 1.1", respectively,
and a median r-band sky brightness of 20.8 mag
arcsec^-2. The data for run 4207 extend from 305 deg <
RA < 60 deg (J2000), have a median seeing of 1.4", 1.3",
and 1.2" in the g, r, and i-band data, and median sky
brightness of r = 20.7 mag arcsec^-2. There are approximately
27k objects per square degree detected by Photo
in these images. 

Because Photo determines the PSF model for a given 
image by using neighboring images (along the direction 
of the scan), the other algorithms would be at a disad- 
vantage when trying to measure the PSF from a single 
frame. For this reason, we "stitch" together 3 images 
along the direction of the scan into a 14' by 27' image, 
with the frame of interest being in the middle. The al- 
gorithms operate on the entire stitched frame, but we 
accept only photometry from the central section. 
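
The stitching step above can be sketched as follows. This is a minimal illustration, assuming each frame is held as a list of pixel rows with the row index running along the scan direction, and that detections are (row, column, flux) tuples; these representations and the function names are assumptions of this example.

```python
def stitch_frames(previous, target, following):
    """Join three frames (lists of pixel rows) along the scan direction.

    Returns the stitched frame and the (start, stop) row range of the target.
    """
    width = len(target[0])
    for frame in (previous, target, following):
        if any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same width")
    stitched = previous + target + following
    start = len(previous)
    return stitched, (start, start + len(target))

def keep_central(detections, central_rows):
    """Keep detections (row, column, flux) whose row lies in the target frame."""
    start, stop = central_rows
    return [d for d in detections if start <= d[0] < stop]
```

Each algorithm would then see the full stitched frame when building its PSF model, while only detections passing `keep_central` are retained.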

4. THE ANALYSIS PIPELINE 

To control the application of each algorithm to the 
data, we require a form of middleware that records 
progress and distributes jobs. For this we have chosen to 
use the Photpipe software developed by the SuperMACHO
and ESSENCE collaborations (Smith et al. 2002).

The majority of Photpipe is written in the Perl lan- 
guage. This provides the internal glue that strings to- 
gether the various processing steps. In general, the 
image-level computations are written in the C language. 
These applications are called by the Perl scripts. 

As a programmatic summary, the Photpipe pipeline 
consists of a series of stages, each of which has actions
which it undertakes, as well as dependencies on the suc- 
cessful completion of previous stages. By default, an en- 
semble of images is passed from stage to stage using input 
and output lists. We have added a stage for DAOPhot, 
DoPhot, and SExtractor, whose actions are merely to
reduce each image using the algorithm. Results of the
analysis are ingested into our time-series database (Section 6).
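
The stage-and-dependency structure described above can be illustrated with a short sketch. This is not Photpipe code (which is written in Perl); it is a hypothetical Python rendering in which each stage is a (name, dependencies, action) tuple and each action maps an input list to an output list.

```python
def run_stages(stages, image_list):
    """Run stages in order, passing the output list of one to the next.

    stages: list of (name, depends_on, action) tuples, where action takes
    an input list and returns an output list. A stage is refused if any of
    its dependencies has not completed successfully.
    """
    completed = set()
    current = list(image_list)
    for name, depends_on, action in stages:
        missing = [dep for dep in depends_on if dep not in completed]
        if missing:
            raise RuntimeError(f"stage {name!r} blocked by {missing}")
        current = action(current)
        completed.add(name)
    return current, completed
```

In this picture, adding a photometry package amounts to appending one more tuple whose action runs that package on every image in its input list.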

We made an effort to explore the response of DAOPhot,
DoPhot, and SExtractor to different input parameters. 
However, because of the number of degrees of freedom 
available to each (of order 100 for both DoPhot and 
SExtractor; of order 10 for DAOPhot and 60 for the 
Perl-language scripts that control its application) it 
was unfeasible to find which combination of parameters 
yielded the optimal results for every analysis presented 
here. We did vary the obvious tuning parameters, such 
as the input FWHM and significance threshold for object 
detection, degree of variation and complexity in the PSF 
model, and clustering size for matching up the ensemble 
of detections, ingesting the results of each analysis into 
our database as a separate dataset. In total we ingested 
112 permutations of dataset, algorithm, and algorithm 
input parameters, and report here on those results that 
reflect our best pipelined application of each algorithm. 

5. THE ALGORITHMS 

In the following sections, we briefly summarize the 
photometry algorithms used in this analysis: Photo,
DAOPhot and allframe, DoPhot, and two versions of
SExtractor. More complete descriptions of each algo- 
rithm are given in the Appendix. 

The SDSS photometric pipeline Photo contains a com- 
plete suite of data reduction tools that take the raw data 
stream, apply reduction and calibration stages, and ex- 
tract photometry from the calibrated images. Because 
the images we are using have been pre-processed by 
Photo, we expect that Photo has a distinct advantage in 
the quality of its photometric measurements. The SDSS 
imaging point spread function (PSF) is modeled heuristi- 
cally in each band using a Karhunen-Loeve (K-L) trans- 
form. Objects are measured self-consistently across all 
bands, and their positions and brightnesses are fit using 
a variety of models, including PSF and extended source 
models. 

The DAOPhot package contains a set of algorithms pri- 
marily designed to do stellar photometry and astrome- 
try in crowded fields. The tools are included as either 
subroutines in the executable program daophot or as in- 
dependent executable programs. DAOPhot builds its PSF 
using multiple iterations of source detection, PSF mod- 
eling, and source subtraction. The PSF model includes 
an analytic form as well as a lookup table of corrections. 
While daophot operates on single images, allframe performs
simultaneous measurements of all sources from a
stack of images. DAOPhot does not attempt to fully 
characterize extended sources. We designed a set of 
Perl-language scripts to automate the application of the 
DAOPhot package. While the scripts have proven to be 
robust in the iterative building of PSFs (Becker 200C|), 
they are also relatively slow. A significant fraction of the 
computing time spent running DAOPhot is due to this im- 
plementation choice, and not necessarily intrinsic to the 
DAOPhot source code. 

The DoPhot package is designed to robustly produce a 
catalog of stellar positions, magnitudes and star/galaxy 
classifications for detections from astronomical images. 
DoPhot was designed to work on a large number of images
quickly with little to no interaction with the user.
However, the version of DoPhot tested here is not the 
original software implementation, but instead a version 
that has been extensively modified to operate robustly 
in the Photpipe environment. DoPhot uses a single PSF 
model that is not allowed to vary spatially, in contrast to 
Photo and DAOPhot, whose PSF models are allowed to 
vary across the image. 

SExtractor is designed to quickly produce reli- 
able aperture photometry catalogs on a large num- 
ber of astronomical sources. SExtractor has been 
used to produce object catalogs for a variety of 
astronomical imaging surveys to date, such as the
NOAO Deep Wide-Field Survey (Jannuzi & Dey 1999),
GOODS-N Survey (Hook & GOODS Team 2002), Deep
Lens Survey (Tyson et al. 2001), IRAC Shallow Survey
(Eisenhardt et al. 2004), and the MAST Survey
(Imhoff et al. 1999). Aside from the ease of installation,
SExtractor is also notable for its speed and versatility. 
It is one of the few packages that aspires to distinguish 
and photometer both stars and galaxies, although its 
lack of a PSF model limits the accuracy of faint point- 
source photometry. Newer versions of the software in- 
clude adaptive windowing functions to provide more ac- 
curate centroids and shapes than the default (isophotal) 
measurements . 

6. THE DATABASE 

To enable the following analysis, we installed a 
MySQL client and server on our local computers and
constructed a database to store our test results (both 
science and performance benchmarking). 

We developed a variety of Python-language scripts to 
help properly ingest data (pipeline versions, parameter 
files, file locations, etc.) into the database in an orga- 
nized manner. We ingested metadata on over 1000 SDSS 
images processed through Photo in five colors (ugriz), resulting
in over 10 million detections in our Object table.
The main tables of our database are Image, Object and 
AlgRun. 

• Image: Metadata about images including data 
source (e.g. SDSS), date, exposure time, filter and 
a pointer to World Coordinate System (WCS) in- 
formation for the image. 

• Object: Data for sources (detections from an im- 
age) and objects (clusters of sources), including po- 
sition (x,y and RA/Dec), classification and vari- 
ous measures of intensity. In addition, sources are 
linked to the image on which they were detected. 

• AlgRun: Information about a particular run of a 
component, including the input parameters used 
for that run. All told, 112 instances of pipeline 
runs were ingested into the database, representing 
different combinations of input data, photometry 
algorithm, and input parameters. Both the Object 
and Image tables link to the AlgRun table. 
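
The relationship between the three tables can be sketched as below. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for the MySQL server; every column name is a hypothetical choice for this example, since the text describes the table contents only in general terms.

```python
import sqlite3

# Hypothetical schema: column names are illustrative, not the real tables.
SCHEMA = """
CREATE TABLE AlgRun (
    algrun_id  INTEGER PRIMARY KEY,
    algorithm  TEXT NOT NULL,
    parameters TEXT
);
CREATE TABLE Image (
    image_id  INTEGER PRIMARY KEY,
    algrun_id INTEGER REFERENCES AlgRun(algrun_id),
    source    TEXT,
    obs_date  REAL,
    exptime   REAL,
    passband  TEXT,
    wcs_ref   TEXT
);
CREATE TABLE Object (
    object_id      INTEGER PRIMARY KEY,
    algrun_id      INTEGER REFERENCES AlgRun(algrun_id),
    image_id       INTEGER REFERENCES Image(image_id),
    x REAL, y REAL, ra REAL, decl REAL,
    classification TEXT,
    psf_mag        REAL,
    ap_mag         REAL
);
"""

def build_database():
    """Create an in-memory database holding the three main tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Linking both Object and Image to AlgRun is what allows each of the ingested permutations of data, algorithm, and parameters to be queried as a separate dataset.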

7. CLUSTERING OF SOURCES INTO OBJECTS 

After ingest of sources and images into the database, 
we require a method to associate sources into objects. 
This allows us to collate the ugriz data for a single as- 
tronomical object, as well as to match up the reductions 
from different algorithms or from different nights. We
use the OPTICS algorithm to do this clustering. 

The OPTICS algorithm (Ordering Points To Identify the
Clustering Structure; Ankerst et al. 1999) is a density-based
method to identify clusters of points in databases.
In this ordering, a reachability distance is defined between
neighboring points. When this distance is exceeded
for neighboring points, the boundary of a cluster
is defined. OPTICS is an improvement of the DBSCAN
algorithm (Ester et al. 1996).

The user provides a minimum number of points to de- 
fine the cluster core. In our case, for a given object we 
have 4 algorithms operating on 5 filters and 2 nights of 
data, meaning we ideally expect 40 points in a cluster. 
We run OPTICS requiring a minimum of 5 points to in- 
clude objects missed in some filters due to their color, 
missed on some nights due to different image depths, or 
missed in different algorithms due to the vagaries of the 
software. Since we only have 3 algorithms besides Photo 
running on these data, an artifact in one image and in one 
filter should not lead to a spurious cluster. We do how- 
ever find spurious clusters in the wings of bright stars, 
where multiple algorithms may detect signal in multiple 
passbands on multiple nights. 

The user also defines reachability distance ε for a given
core set of points. For each point in this neighborhood,
all points within ε of it are searched, repeating until no
more points can be added to the cluster. The data are
stored in a tree-based spatial index. A search in the
neighborhood ε of a given object scales with the number
of points N as N log(N). We chose a clustering distance
of 1 pixel (0.4").
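
The neighborhood expansion described above can be illustrated with a simplified sketch. Note that this is a DBSCAN-style region-growing routine, not the OPTICS ordering itself, and that it uses a brute-force neighbor search rather than the tree-based spatial index; the function name and the label convention are assumptions of this example.

```python
import math

def cluster_sources(points, eps, min_points):
    """Group (x, y) detections by repeated eps-neighborhood expansion.

    Returns one label per point: 0, 1, 2, ... for clusters, -1 for points
    that belong to no cluster. Neighbor search is brute force, O(N^2).
    """
    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)        # None means "not yet visited"
    next_label = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_points:
            labels[i] = -1               # provisionally unclustered
            continue
        labels[i] = next_label
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = next_label   # edge point reclaimed by a cluster
            if labels[j] is not None:
                continue
            labels[j] = next_label
            more = neighbors(j)
            if len(more) >= min_points:  # only core points keep expanding
                queue.extend(more)
        next_label += 1
    return labels
```

With `eps` set to 1 pixel and `min_points` to 5, this mirrors the parameter choices quoted in the text, although the run-time scaling of the brute-force search is much worse than N log(N).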

One way we found to optimize the clustering was to 
relate the size of each page in the database to the length 
of the input list to be clustered. We found that too large 
(or too small) a page size would impact the computation 
of the clustering by an order of magnitude. Figure 1
demonstrates the OPTICS run time as a function of the 
number of points per page (or "leaf") in the database. 

8. METHODOLOGY 

In this section and those below, we describe the prac- 
tical methods used to quantify DAOPhot, DoPhot, Photo, 
and SExtractor. 

Our analyses are designed to ascertain the level of sys- 
tematics inherent to each photometry algorithm by com- 
paring the measured properties of objects on multiple 
nights. We also compare brightness, shape, and cen- 
troiding measurements by the different algorithms on the 
same imaging data. We start with the assumption that 
Photo's star-galaxy classification is "truth" , and use this 
information to derive similar classification boundaries for 
the other algorithms. We then repeat our analyses using 
these new algorithm-derived boundaries. 

Our initial queries to the Object table select all objects
from the comparison algorithms, but only a subset of detections
from Photo. We only include Photo detections
where the objc_flags* indicate that the object is not SATURATED,
BLENDED, or BRIGHT, was found in the BINNED1 image,
and was not DEBLENDED_AS_MOVING. These objects essentially
serve as the "seed" objects that we use for clustering.

* http://www.sdss.org/dr5/products/catalogs/flags.html



We start this process by selecting only clusters where 
Photo has detections in both runs that it thinks are stars. 
This criterion is used to select measurements from other 
algorithms to be used for magnitude zero-pointing, de- 
termination of star-selection criteria, and comparison of 
shape measurements and photometric depth. We use 
PSF magnitudes when available, and aperture magnitudes
otherwise†.

DAOPhot, DoPhot, and SExtractor report their results 
in instrumental magnitudes, and we have to derive zero- 
point offsets if we want to directly compare their data 
to Photo. For each algorithm, filter, and run combina- 
tion, we take all Photo-selected stars and find the 3- 
sigma clipped average difference in magnitudes between 
Photo and the algorithm (we use aperture magnitudes for 
SExtractor; PSF magnitudes for DAOPhot and DoPhot). 
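
The zero-point derivation above can be sketched as an iterative sigma-clipped mean. The iteration limit, the use of the population standard deviation, and the function names are assumptions of this example; the text specifies only that a 3-sigma clipped average is used.

```python
from statistics import mean, pstdev

def clipped_mean(values, nsigma=3.0, max_iter=10):
    """Mean after iteratively rejecting points beyond nsigma of the mean."""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        mu, sigma = mean(kept), pstdev(kept)
        survivors = [v for v in kept if abs(v - mu) <= nsigma * sigma]
        if len(survivors) == len(kept):
            break                        # converged: nothing was rejected
        kept = survivors
    return mean(kept)

def zero_point(photo_mags, instrumental_mags):
    """Clipped mean offset that maps instrumental magnitudes onto Photo's."""
    offsets = [p - m for p, m in zip(photo_mags, instrumental_mags)]
    return clipped_mean(offsets)
```

One offset of this kind would be computed for each combination of algorithm, filter, and run, and added to that algorithm's instrumental magnitudes.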

9. STAR/GALAXY SEPARATION 

The initial step in this analysis is to define star/galaxy 
boundaries for each algorithm. To do this, we select all 
objects that Photo classifies as stars and galaxies, and 
plot the distribution of the star/galaxy separation met- 
rics from each algorithm. In particular, we have cho- 
sen to use Sharp for DAOPhot, Type for DoPhot, and 
CLASS_STAR for SExtractor. By studying the distribu- 
tion of these parameters, we can derive star/galaxy clas- 
sification schemes for each algorithm. For all Photo- 
selected stars and galaxies, we plot each algorithm's 
star/galaxy parameter in 4 magnitude bins : 14 < r < 
20; 20 < r < 20.5; 20.5 < r < 21; 21 < r < 22. Each win- 
dow contains a histogram and the cumulative distribu- 
tion of that parameter plotted as a dashed line. We show 
example results for DAOPhot in Figure 2 and SExtractor
in Figure 3.

9.1. Results Using Photo's Classification 

In DAOPhot, Sharp for stars is distributed in a near
Gaussian that is centered on value 0.0 with a characteristic
width. Figure 2 shows the r-band distribution
from run 4207. The data are split into 4 magnitude bins.
The distributions for stars are plotted in the left figure;
those for galaxies on the right. As expected, the
stellar Sharp distribution widens toward fainter objects,
from 0.04 at the bright end to 0.17 at the faint
end. The parameter distribution for galaxies remains
relatively constant with magnitude. We have combined
the analyses from runs 3437 and 4207, and calculated the
width of the stellar distribution in the brightest bin. The
mean and width of this distribution are listed in Table 1.
We define our filter-dependent DAOPhot star-selection
criterion as anything having Sharp within 3σ of the mean
in the brightest bin. We define galaxies as those objects
with Sharp larger than +3σ from the mean. Anything
with Sharp less than -3σ from the mean is sharper than
the PSF and likely to be an image artifact. We note
that other selection criteria are possible and may lead
to better results, such as using parameters Sharp and
Chi in combination. However, Sharp's highly symmetric
distribution for stars and highly skewed distribution
for galaxies in Figure 2 suggest that it is appropriate,
although not necessarily optimal, to use it as the sole
criterion. The same is true for the other metrics defined
below.

† Aperture photometry is performed at a radius of 7.4".
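
The Sharp-based selection described above can be written as a three-way rule. This is a minimal sketch; the function name and the string labels are assumptions of this example, and the mean and width would in practice be the filter-dependent values measured in the brightest magnitude bin.

```python
def classify_by_sharp(sharp, mean, sigma, nsigma=3.0):
    """Label a DAOPhot detection from its Sharp value.

    mean, sigma: center and width of the stellar Sharp distribution in the
    brightest magnitude bin for the relevant filter.
    """
    deviation = sharp - mean
    if deviation > nsigma * sigma:
        return "galaxy"      # broader than the PSF
    if deviation < -nsigma * sigma:
        return "artifact"    # sharper than the PSF
    return "star"
```

For example, with an illustrative width of 0.04 centered on 0.0, a detection with Sharp = 0.05 is retained as a star, while Sharp = 0.5 is classed as a galaxy.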

DoPhot returns a Type parameter for each object it 
measures. A Type = 1 object is considered a "perfect" 
star, and is used in the computation of the weighted PSF. 
A Type = 3 object is not as peaked as a single star, and 
is assumed to be a blend. It is however photometered 
with a single PSF. A Type = 7 object is too faint to do 
a full 7-parameter fit, so a 4-parameter fit was under- 
taken. We found that stars in our data had almost ex- 
clusively Type = 1, with very few having Type = 7. We 
found that galaxies tended to have Type = 3 or Type = 
1, with a small fraction of Type = 7. Since this is our 
only selection criterion, we select stars as all objects with 
Type = 1 and galaxies as all objects with Type = 3, rec- 
ognizing that our stars will have non-zero contamination 
by galaxies. 

In SExtractor, CLASS_STAR is designed to be a 
star/galaxy classification toggle, where a value of 1 rep- 
resents an object highly likely to be a star. This requires 
that the correct input FWHM be applied for the filter- 
ing to work optimally. Therefore we use the FWHM 
as derived by Photo as inputs to SExtractor. As Fig- 
ure [3] shows, this parameter tends to work well. The top 
panel shows the distribution for stars, and the bottom 
for galaxies. For all filters except for u-band, we chose 
a cutoff of CLASS_STAR = 0.8 as the line separating stars 
from galaxies. In the u-band, many of the stars are also 
distributed near CLASS_STAR = 0, and we lowered our 
delineation to CLASS_STAR = 0.2.
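
The filter-dependent CLASS_STAR cut described above can be sketched as follows. The function and constant names are hypothetical, and treating a value exactly at the cutoff as a star is an assumption, since the text does not say on which side of the line such objects fall.

```python
# Cutoffs quoted in the text: 0.2 in the u-band, 0.8 in all other filters.
CLASS_STAR_CUT_BY_BAND = {"u": 0.2}
DEFAULT_CLASS_STAR_CUT = 0.8

def classify_by_class_star(class_star, band):
    """Label a SExtractor detection as 'star' or 'galaxy'."""
    cut = CLASS_STAR_CUT_BY_BAND.get(band, DEFAULT_CLASS_STAR_CUT)
    return "star" if class_star >= cut else "galaxy"
```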

The extent of galaxy contamination in these algorithms
is summarized in Table 2 and Table 3. We list in Table 2
the total fraction of objects that were classified as stars
by both the algorithm and Photo (S-S); as stars in the
algorithm and galaxies in Photo (S-G); as galaxies in
the algorithm and stars in Photo (G-S); and galaxies in
both algorithms (G-G). We make a similar comparison
in Table 3, which lists the fraction of all objects that each
algorithm (mis)classified in both runs. We limit this selection
to objects brighter than 21st magnitude, where
Photo's star-galaxy separation has been tested extensively
and is considered "truth" for the purposes of these
comparisons.
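
The four fractions tabulated in this comparison can be computed as in the sketch below, assuming each matched object is reduced to a pair of single-letter classes ('S' or 'G'), the first from the algorithm under test and the second from Photo. The function name and input representation are assumptions of this example.

```python
from collections import Counter

CATEGORIES = ("S-S", "S-G", "G-S", "G-G")

def classification_fractions(pairs):
    """Fractions of objects in each (algorithm class)-(Photo class) category.

    pairs: iterable of (algorithm_class, photo_class), each 'S' or 'G'.
    """
    counts = Counter(f"{alg}-{photo}" for alg, photo in pairs)
    total = sum(counts[key] for key in CATEGORIES)
    if total == 0:
        raise ValueError("no classified objects supplied")
    return {key: counts[key] / total for key in CATEGORIES}
```

The same function applies to the self-consistency comparison if the two classes are taken from the same algorithm in the two runs.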

From Table 2 we see that DoPhot and Photo disagree
on anywhere from 1 to 10% of all bright objects (in- 
creasing to ~ 20% when looking at all brightnesses). In 
general, DoPhot is more likely to classify something as a 
star that Photo thinks is a galaxy. The fraction of de- 
tected Photo-classified galaxies is also lowest in DoPhot, 
suggesting that this algorithm is very inefficient at de- 
tecting galaxies, and biased towards classifying galaxies 
it does find as stars. SExtractor tends to disagree with 
Photo in the opposite sense - SExtractor is likely to call 
something a galaxy that Photo classifies as a star. Run 
3437 is particularly egregious in this regard. The most
obvious explanation would be that we fed the wrong initial estimate of
the stellar FWHM (derived from the Photo analysis) to
the package, leading it to make poorly informed
choices for star/galaxy separation. However, runs 3437
and 4207 were treated equally in this regard, so this is
likely not the culprit.

DAOPhot agrees with Photo a large fraction of the time, 
and is slightly more likely to call a Photo-classified star 
a galaxy than a Photo-classified galaxy a star. We have
created plots such as Figure 4 to investigate each permutation
of (mis)classifications. These depict color-color
diagrams of objects classified in g, r, and i as either
stars or galaxies. We plot here only the bright objects
(14 < r < 20) classified by both DAOPhot and Photo in
run 3437 (the figure for run 4207 is very similar). To yield
a point on this diagram, the object must be classified the
same by each algorithm in all 3 passbands. Thus the fraction
of objects in each window will slightly disagree with
the entries in Table 2. It is clear that the misclassifications
(the off-diagonal plots) are drawn more from the stellar 
than the galactic locus, thus we conclude that DAOPhot 
correctly calls some objects stars that Photo incorrectly 
calls galaxies, and vice versa. 

9.2. Results Using Each Algorithm's Classification 

We also investigate the consistency within a given algorithm
by looking at the classifications of the same object
detected in both runs. This is listed in Table 3. As
discussed above, DoPhot is biased towards calling objects 
stars, but shows here that it is very self consistent in that 
regard. SExtractor classifies a higher fraction of objects 
as galaxies than do the other algorithms, and apparently 
had difficulty with objects classified as stars in 4207 and 
galaxies in 3437. DAOPhot disagrees with itself for 12% of 
objects, while Photo is the most consistent (~2%) with
regard to misclassifications of these bright objects. We
note that if we examine the entire sample of clustered
objects, including objects fainter than 21st magnitude,
the misclassification rates in Table 3 degrade worst for
Photo, increasing from ~2% to ~12%. The ratios for
the other algorithms tend to remain constant at fainter 
magnitudes. 

9.3. Classification Conclusions 

Both DoPhot and SExtractor have inadequacies in 
their star/galaxy classification schemes as derived in this 
experiment. It is very likely that improvements can be 
made to SExtractor using the non-linear filters from 
Enhance Your Extraction (EyE)‡, and it should be
carefully considered as an option with the potential to 
contribute to LSST algorithm development. Surpris- 
ingly, DAOPhot does a better job at classification than 
these algorithms, although its galaxy characterization 
methods are limited. Photo is the best all-around pack- 
age in this regard due to its extensive analysis and char- 
acterization of each object. 

10. PHOTOMETRY 

For Photo-selected stars and galaxies, we calculate the 
difference of an object's magnitude as measured by an algorithm
alg1 in run1 and alg1 in run2, or by algorithm
alg1 in run1 and algorithm alg2 in run1. We plot these
distributions as a function of magnitude. We do this 
for both aperture and PSF (when available) magnitudes, 
and for stars and galaxies. Example r-band results for 
DAOPhot are shown in Figure 5 for both aperture and
PSF photometry. Each figure contains four panels, de- 
scribed below. 

‡ http://terapix.iap.fr/soft/eye

10.1. Panel 1

The differences in measured magnitudes (ΔM = M1 -
M2) are plotted as a function of Photo's magnitude. The
median ΔM of objects brighter than 18th magnitude (or
the brightest magnitude plus one if no objects brighter
than 18th are present; typically this uses thousands of
objects) was subtracted from the entire distribution, so
that it is centered on y = 0. We cut out the brightest
and dimmest 0.5% of the data to avoid outliers. At the
bright end, the width stops following Poisson statistics 
and levels off at a characteristic width indicative of sys- 
tematics in the analysis. It is this width that we choose 
to characterize our algorithms. 
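
As an illustration only, the Panel 1 preparation can be sketched as follows. This is not the pipeline's code: the function name and arguments are hypothetical, and we assume here that the 0.5% cut is applied to the brightest and dimmest objects in magnitude rather than to the extremes of the magnitude difference.

```python
import numpy as np

def prepare_panel1(mag_ref, m1, m2, bright_limit=18.0, trim=0.005):
    """Median-center Delta M = M1 - M2 and trim the magnitude extremes.

    mag_ref : reference (Photo) magnitudes used for the x-axis
    m1, m2  : the two magnitude measurements being compared
    """
    mag_ref = np.asarray(mag_ref, dtype=float)
    delta_m = np.asarray(m1, dtype=float) - np.asarray(m2, dtype=float)

    # Objects brighter than bright_limit define the zero point; if there
    # are none, use everything within one magnitude of the brightest object.
    bright = mag_ref < bright_limit
    if not bright.any():
        bright = mag_ref < mag_ref.min() + 1.0
    delta_m = delta_m - np.median(delta_m[bright])

    # Assumption: the 0.5% cut is applied in magnitude (brightest and
    # dimmest objects), not in Delta M.
    lo, hi = np.quantile(mag_ref, [trim, 1.0 - trim])
    keep = (mag_ref >= lo) & (mag_ref <= hi)
    return mag_ref[keep], delta_m[keep]
```

A constant zero-point offset between the two measurements is removed entirely by the median subtraction, so only the scatter remains to be characterized.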

For aperture magnitudes, the systematic floor is 
smaller at the bright end because there is no reliance on 
any PSF model, and aperture measurements are ideally 
Poisson limited. This distribution shows a characteristic 
broadening at fainter magnitudes as measurements be- 
come sky-noise dominated. We naively expected most 
algorithms to perform similarly well in aperture magni- 
tude measurements. However, there are enough degrees 
of freedom in centroiding and in treating the brightness 
of neighboring objects that these results in actuality are 
significantly different. 

For PSF magnitudes, the bright-end systematic floor is 
much larger due to reliance on a PSF model which is cer- 
tain to be incomplete at some level. Ideally, gross errors 
in the PSF model come out in the aperture correction, 
and this systematic floor is then indicative of the degree 
of spatial variation in the aperture corrections. At fainter 
magnitudes, the distribution remains much tighter than 
for aperture measurements since sky noise does not con- 
tribute as much in a PSF-weighted measurement. 

10.2. Panel 2 

We divide the $\Delta M$ distribution into 10 bins. The points
in each bin are sorted by $\Delta M$ and the first (Q1) and third
(Q3) quartiles are determined (the indices corresponding
to 0.25 and 0.75 the length of the sorted array, respectively).
The values of the points associated with Q1 and
Q3 are used to determine the interquartile range (IQR)
of these data. We choose to use the IQR to lessen our 
sensitivity to outliers (such as variable stars). 

We find the uncertainty in this width by assuming the
data are normally distributed, where $\sigma_{\rm mean} = 0.74 \times {\rm IQR}$
and $\sigma_{\rm median} = \sqrt{\pi/2} \times \sigma_{\rm mean}$. The standard deviation in
the IQR is $\sigma_{\rm IQR} = \sqrt{\pi} \times 0.55 \times {\rm IQR}$. The uncertainty in
the IQR is $\sigma_{\rm IQR}/\sqrt{N - 1}$.
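
A minimal sketch of this width estimate for a single bin is given below. The function name is hypothetical, and the prefactor relating the IQR to its standard deviation is exposed as a parameter because its default value, sqrt(pi) x 0.55, is our assumption about the intended constant.

```python
import numpy as np

def iqr_width(delta_m, sigma_iqr_factor=np.sqrt(np.pi) * 0.55):
    """Return (sigma_mean, sigma_mean_err) for one bin, derived from the IQR.

    Requires at least two points in the bin.
    """
    x = np.sort(np.asarray(delta_m, dtype=float))
    n = x.size
    q1 = x[int(0.25 * n)]   # element at 0.25 of the sorted array length
    q3 = x[int(0.75 * n)]   # element at 0.75 of the sorted array length
    iqr = q3 - q1

    sigma_mean = 0.74 * iqr                       # Gaussian-equivalent width
    # Assumed form: sigma_IQR = factor * IQR, uncertainty = sigma_IQR / sqrt(N - 1)
    iqr_err = sigma_iqr_factor * iqr / np.sqrt(n - 1)
    return sigma_mean, 0.74 * iqr_err
```

Because only the quartiles enter, a small number of extreme points (for example, variable stars) leaves the estimated width nearly unchanged.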

We plot $\sigma_{\rm mean}$ and its uncertainty (as derived from the
IQR) in each bin. These data are then fit with the functional
form $A + Bz + Cz^2$, where $z = 10^{0.4 M}$, which describes
well the growth of this envelope with magnitude.
This best fit is plotted as a solid line. We evaluate this
equation one magnitude below the brightest data point,
and use this single number to characterize the systematics
inherent in the comparison. The $3\sigma$ envelope allowed
by this relationship is plotted in Panel 1. These results
are summarized in Table [3] for Photo-selected stars.
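
The envelope fit can be sketched as below, under stated assumptions: the scaling variable is taken to be z = 10^(0.4 M), rescaled to the brightest bin (which only rescales B and C and improves numerical conditioning), and "one magnitude below" is read as one magnitude fainter than the brightest point. The function name is hypothetical.

```python
import numpy as np

def fit_width_envelope(mag_bins, sigma, sigma_err):
    """Fit sigma = A + B z + C z^2 with z = 10**(0.4 (M - M0)).

    M0 is the brightest bin magnitude. Returns ((A, B, C), floor), where
    floor is the fit evaluated one magnitude fainter than M0.
    """
    mag_bins = np.asarray(mag_bins, dtype=float)
    m0 = mag_bins.min()
    z = 10.0 ** (0.4 * (mag_bins - m0))

    # Weighted least squares; np.polyfit expects weights of 1/sigma.
    weights = 1.0 / np.asarray(sigma_err, dtype=float)
    c, b, a = np.polyfit(z, np.asarray(sigma, dtype=float), 2, w=weights)

    z_eval = 10.0 ** 0.4          # z at M0 + 1 magnitude
    floor = a + b * z_eval + c * z_eval**2
    return (a, b, c), floor
```

The single number `floor` plays the role of the bright-end systematic width quoted for each algorithm.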

We note that the LSST Science Requirement Document
states that photometry should be reproducible to
0.005 magnitudes. That translates into a systematic bin
width at the bright end of $\sqrt{2} \times 0.005$, or 0.007 magnitudes.
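
The factor of $\sqrt{2}$ follows from differencing two independent measurements, each assumed to carry the same 0.005 magnitude uncertainty:

```latex
\sigma_{\Delta M} = \sqrt{\sigma_{M_1}^2 + \sigma_{M_2}^2}
                  = \sqrt{2} \times 0.005 \approx 0.0071\ \mathrm{mag}
```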



10.3. Panel 3 

We evaluate and plot the fraction of stars in Panel 1
that are more than $3\sigma$ from the mean. For night-to-night
comparisons, this is very sensitive to the level
of variability in the sample. For algorithm-to-algorithm 
comparisons on a given set of data, it allows us to uncover 
differences in the algorithms. 
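
A short sketch of the outlier statistic follows. The function name is hypothetical, and `sigma` may be a single width or a per-object value taken from the fitted envelope. For purely Gaussian scatter the expected fraction is about 0.27%, so a much larger value points to variability or algorithmic differences.

```python
import numpy as np

def outlier_fraction(delta_m, sigma, nsigma=3.0):
    """Fraction of points more than nsigma * sigma from the sample mean."""
    delta_m = np.asarray(delta_m, dtype=float)
    # Allow either one width for all objects or one width per object.
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), delta_m.shape)
    deviation = np.abs(delta_m - delta_m.mean())
    return float(np.mean(deviation > nsigma * sigma))
```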

10.4. Panel 4 

We add in quadrature the uncertainties associated with
each component $M_1$ and $M_2$ and plot the distribution of
$\Delta M / \sigma_{\Delta M}$. These data are binned, and we derive each
bin's IQR and its uncertainty and overplot these points. 
If the photometry packages accurately quantify the mea- 
surement uncertainties, these binned points should all lie 
near 1.0. 
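
As a sketch of this check for a single bin (the function name is hypothetical, and the IQR is converted to a Gaussian-equivalent width with the 0.74 factor used for Panel 2):

```python
import numpy as np

def pull_width(m1, m2, err1, err2):
    """Gaussian-equivalent width of Delta M / sigma_DeltaM for one bin."""
    m1, m2, err1, err2 = (np.asarray(a, dtype=float)
                          for a in (m1, m2, err1, err2))
    # Reported uncertainties added in quadrature.
    pull = (m1 - m2) / np.sqrt(err1**2 + err2**2)
    q1, q3 = np.percentile(pull, [25.0, 75.0])
    return 0.74 * (q3 - q1)
```

A width near 1 indicates faithful error estimates; a width near 2 would indicate reported errors that are too small by a factor of 2, and a width near 0.5 errors that are too large by the same factor.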

10.5. Results Using Photo-Selected Stars 

We have designed three variants of the tests described 
above to characterize the algorithms' photometric performance:
comparing photometry of data taken on different
nights as an overall characterization of each algorithm; 
comparing different algorithms' photometry of the same 
data, providing a relative characterization that is insen- 
sitive to stellar variability; and comparing aperture and 
PSF magnitudes from the same algorithm on the same 
data, yielding an estimate of the scatter introduced by 
spatial variation of the aperture corrections. 

We first characterize the photometric accuracy of each 
algorithm by comparing the brightness of Photo-selected 
stars measured in both SDSS runs. Figure [5] shows exam- 
ple r-band summary plots for DAOPhot in both aperture 
and PSF photometry for Photo-selected stars. The widths
of the $\Delta M$ distributions are summarized in Table [4]. We
note that both Photo and DAOPhot produce g, r, and 
i-band aperture photometry that meets LSST's SRD on 
photometric accuracy. No other algorithms are able to 
meet this requirement, failing to reach the benchmark of 
0.007 magnitudes. We note that no algorithms are able 
to meet the SRD in PSF photometry - the numbers con- 
sistently fall short by a factor of 2-3. DoPhot performs 
worst in terms of PSF photometry. 

Most algorithms tend to underestimate aperture magnitude
errors of bright objects compared to the empirical
scatter, with the exception of Photo which tends to over- 
estimate the aperture errors of bright objects by as much 
as a factor of 2. SExtractor underestimates the aperture 
errors of all objects by a factor of 2-3. Photo's PSF mag- 
nitude errors represent the empirical scatter very faith- 
fully. DoPhot and DAOPhot underestimate their PSF
uncertainties by ~ 20%.

We next look at the width of the $\Delta M$ distribution
for different algorithms running on the exact same data. 
This is insensitive to stellar variability, and allows us 
to localize any differences to the algorithms themselves. 
The results for the r-band are listed in Tables [5] and 
[6] for PSF and aperture photometry, respectively. The 
aperture results are very similar for all pairs of algo- 
rithms, while the PSF photometry comparison of Photo 
to DAOPhot is superior to any comparison using DoPhot. 

Finally, we compare aperture and PSF magnitudes
from the algorithms, yielding an estimate of the additional
scatter coming from spatial variation in the aperture
corrections (Table [7]). We limit our comparison to
Photo-selected stars. A priori, we expect Photo to outperform
all other algorithms here, since its PSF magnitudes
have already been aperture corrected. Ideally,
the scatter here should be very close to the aperture
photometry results in Table [4]. Table [7] indicates that
Photo's results are equivalent to DAOPhot's, and closer
to the PSF photometry scatter than the aperture photometry
scatter. This suggests that Photo's aperture
corrections have not successfully accounted for spatial
variation in the PSF. The numbers in Table [7] do tend to
bridge the difference between the aperture and PSF scatter
in Table [4], verifying that the PSF photometry scatter
contains a baseline contribution from the aperture photometry
and an additional contribution from aperture
corrections.

10.6. Results With Algorithm-Selected Stars 

We repeat this analysis using objects each algorithm 
selects as a star. These results are listed in Table [8]
and are very similar to the Photo-selected analysis. The
largest difference is that the fraction of $3\sigma$ outliers increases
by a factor of 2-3, indicating that the star-galaxy
classification schemes for the algorithms are inferior to 
Photo's. Some fraction of this additional scatter comes 
from not knowing exactly which pixels in the images have 
been interpolated over by Photo due to cosmic rays or 
bad pixels. 

10.7. Photometry Conclusions 

The aperture and PSF photometry from DAOPhot and 
Photo are clearly superior. In particular, DAOPhot per- 
formed as well as Photo, which is encouraging as Photo 
was designed and commissioned with this SDSS data set 
in mind. 

No algorithms were able to meet the LSST SRD in 
terms of PSF photometry. The ideal aperture corrections
to the PSF photometry should bring the PSF scatter
in line with that from the aperture photometry. The
only algorithm for which this degree of calibration has 
been done is Photo. However, it appears that Photo has 
not sufficiently compensated for spatial variations in its 
aperture corrections to PSF magnitudes, since its aperture
vs. PSF scatter is commensurate with DAOPhot's.

As far as calculating uncertainties, the PSF magnitude 
errors from Photo most closely track the empirical uncertainties.
Aperture photometry uncertainties are either
over- or underestimated in all algorithms.

It is clear that the task of PSF photometry still requires 
significant research and development if LSST is to meet 
its SRD in terms of photometric accuracy. 

11. SHAPE MEASUREMENTS 

For the Photo-selected stars and galaxies, we extract
the algorithm shape parameters $I_{xx}$, $I_{yy}$, and $I_{xy}$
(DAOPhot does not report these values on an object-by-object
basis). We calculate the ellipticities derived from
these moments

$$e_1 = \frac{I_{xx} - I_{yy}}{I_{xx} + I_{yy}}, \qquad e_2 = \frac{2 I_{xy}}{I_{xx} + I_{yy}} \qquad (1)$$

and generate figures comparing each algorithm's shape
measurements to Photo's, dividing the data into 4 magnitude
bins. We fit and plot a linear relationship between Photo's
shape and that from the algorithm. The RMS of the
scatter about this line is calculated and listed in Table [9]
for Photo-selected stars, and Table [10] for Photo-selected
galaxies. Figure [6] shows a representative set of figures
comparing r-band Photo and SExtractor ellipticity parameters
from run 3437.
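
For concreteness, Equation (1) can be evaluated as in the following sketch. The function name is hypothetical; the inputs are the second moments reported by each package and may be scalars or arrays.

```python
import numpy as np

def ellipticity(ixx, iyy, ixy):
    """Ellipticity components (e1, e2) from second moments, as in Eq. (1)."""
    ixx, iyy, ixy = (np.asarray(a, dtype=float) for a in (ixx, iyy, ixy))
    denom = ixx + iyy
    e1 = (ixx - iyy) / denom
    e2 = 2.0 * ixy / denom
    return e1, e2
```

A round object (Ixx = Iyy, Ixy = 0) gives e1 = e2 = 0, while an object elongated along the x-axis gives e1 > 0.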

11.1. Shape Measurement Results 

SExtractor is the only algorithm that we tested which 
reliably calculates the shapes of galaxies, thus we have 
limited our comparison of shape measurements to Photo 
and SExtractor. In addition, for ease of tabulation and 
interpretation, we present only the results of the r-band 
analyses. We note that the g and i-band results are quan- 
titatively similar. 

We compare the ellipticities derived from both the 
"isophotal" shape measurements from SExtractor 2.3.2 
and the "windowed" measurements from SExtractor 
2.4.4. The linear relationships between Photo's and 
SExtractor's r-band measurements, in the form
$e_{\rm Photo} = A + B\, e_{\rm SExtractor}$, are shown in Table [9] for stars, and
Table [10] for galaxies. We report these numbers for the
brightest magnitude bin (14 < r < 20). We also list the 
RMS scatter about this line. 

We first note the significantly reduced scatter from the 
best-fit linear relationships when using the "windowed" 
shape measurements from SExtractor 2.4.4. In particular,
this yields up to an order of magnitude less scatter
in the stellar shape measures (Table [9]), suggesting that
SExtractor 2.3.2 is not to be used for determining stellar
shapes and ellipticities. The improvement for galaxies is
a more modest factor of 3 (Table [10]), but still very significant.

The ellipticities of galaxies in SExtractor 2.3.2 are similar
to those in Photo (slope ~ 1); the ellipticities of both
stars and galaxies in SExtractor 2.4.4 differ from those
in Photo (slope ~ 2.0 for stars, ~ 1.8 for galaxies). Figure [6]
shows an example plot of ellipticity comparisons
for Photo-selected galaxies. The left panel shows this relationship
for SExtractor 2.3.2, and the right panel for
SExtractor 2.4.4. The windowed measurements clearly
lead to a tighter relationship.

11.1.1. Shape Measurement Conclusions 

Adaptive second moments are more reliable than 
isophotal moments. We recommend that all SExtractor 
analyses relying upon shape measurements use "win- 
dowed" shape measures. Non-windowed shape measures 
should not be used for stars. 

12. CENTROIDING 

We also compare centroiding offsets between objects as 
measured in the same images by different algorithms. To 
do this accurately, we must first determine the conven- 
tions used to describe the image array. For both DAOPhot 
and SExtractor, the center of the lower-left hand cor- 
ner pixel (LLHC) is coordinate (1.0, 1.0). In Photo and 
DoPhot, the LLHC is at coordinate (0.5, 0.5). 
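
A minimal sketch of the bookkeeping this implies is shown below. The table and function names are hypothetical; the only inputs taken from the text are the coordinates each package assigns to the center of the lower-left pixel.

```python
# Coordinate assigned to the CENTER of the lower-left pixel by each package,
# as stated in the text.
LL_PIXEL_CENTER = {
    "DAOPhot": 1.0,
    "SExtractor": 1.0,
    "Photo": 0.5,
    "DoPhot": 0.5,
}

def convert_xy(x, y, source, target="Photo"):
    """Convert a centroid from one package's pixel convention to another's."""
    shift = LL_PIXEL_CENTER[target] - LL_PIXEL_CENTER[source]
    return x + shift, y + shift
```

For example, `convert_xy(1.0, 1.0, "SExtractor")` returns `(0.5, 0.5)`. Without this half-pixel shift, every cross-algorithm comparison would show a spurious constant offset.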

We perform an analysis similar to that described in
Section [10] but describing the distribution of pixel offsets
as a function of magnitude. This should reveal any
centroiding biases as a function of magnitude. Example
Figure [7] includes the three panels described in Section [10.1],
Section [10.2], and Section [10.3]. Here the width
of the bright end of the distribution in Panel 1 reflects
centroiding systematics.

We also plot in each Panel 1 a quadratic fit to the
median value of the X, Y-coordinate pixel offsets of the
form $\Delta_{X,Y} = A + Bz + Cz^2$, where $z = M - M_0$, $M_0$
is the magnitude of the first (brightest) bin and $M$ the
central magnitude for each bin. We plot the median values
and their uncertainties, and the functional fit as a
solid line. Any shape to this distribution ($B \neq 0$ or $C \neq 0$)
suggests systematics in object centroiding as a function
of magnitude. These results are summarized in Table [11]
for Photo-selected stars. Table [12] shows the width of this
distribution, evaluated 1 magnitude below the brightest
unsaturated star, comparing algorithm to algorithm for
r-band centroids in run 3437 (upper triangular matrix)
and run 4207 (lower triangular matrix).
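
The trend fit can be sketched as follows. This is an unweighted illustration with a hypothetical function name and an assumed number of equal-width magnitude bins; the analysis in the text also carries uncertainties on the medians, which are omitted here for brevity.

```python
import numpy as np

def fit_offset_trend(mag, offset, n_bins=6):
    """Fit median pixel offsets per magnitude bin with A + B z + C z^2.

    z = M - M0, where M is each bin's central magnitude and M0 that of the
    brightest occupied bin. At least three occupied bins are required.
    """
    mag = np.asarray(mag, dtype=float)
    offset = np.asarray(offset, dtype=float)

    edges = np.linspace(mag.min(), mag.max(), n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    idx = np.digitize(mag, edges[1:-1])          # bin index 0 .. n_bins - 1

    used = [i for i in range(n_bins) if np.any(idx == i)]
    medians = np.array([np.median(offset[idx == i]) for i in used])
    z = centers[used] - centers[used[0]]

    c, b, a = np.polyfit(z, medians, 2)          # highest power first
    return a, b, c
```

Values of B or C that differ significantly from zero indicate a magnitude-dependent centroid bias; A alone is a constant offset between the two algorithms.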

12.1. Centroiding Results 

We compare the measured positions of objects in each 
image as a function of magnitude. Accurate centroiding 
is required to deliver the SRD relative astrometry re- 
quirement of 0.01" (here 0.025 pixels). We are unable to 
comment on the absolute astrometry requirements since 
that involves knowledge of astrometric distortions in the 
focal plane, which are different here than will be the case 
in LSST. 
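
The conversion to pixels uses the approximate pixel scale of 0.4" per pixel quoted for these images:

```latex
\frac{0.01''}{0.4''\,\mathrm{pixel}^{-1}} = 0.025\ \mathrm{pixels}
```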

We list the results of the quadratic fit in Table [11] 
for Photo-selected stars. SExtractor 2.3.2 consis- 
tently has significant offset, linear, and quadratic terms. 
DoPhot rarely shows significant quadratic terms, but 
tends to have significant zeropoint offsets at ~ 0.01 
pixels. Both DAOPhot and SExtractor 2.4.4 compare
very well with Photo's positional measurements, routinely
having offsets below 0.005 pixels, linear terms below
0.003 pixels/magnitude, and quadratic terms below
0.001 pixels/magnitude$^2$.

An example demonstrating the improvements between 
SExtractor 2.3.2 and SExtractor 2.4.4 is shown in Figure [7].
Here we plot 2 figures containing the three panels
described in Section [12]. The left panel shows the distribution
of z-band $\Delta X$ pixel offsets between SExtractor
2.3.2 and Photo. The right panel provides a comparison 
between SExtractor 2.4.4 and Photo. It is clear there is 
a much smaller trend of the median pixel offset with mag- 
nitude in SExtractor 2.4.4, as well as a smaller overall 
RMS to the distribution. 

We use this RMS at the bright end to further characterize
the centroiding accuracy. This comparison of
all algorithm centroids is shown in Table [12] for r-band
x-coordinate centroids. This table indicates that the algorithms
are much more consistent with each other than
they are with Photo, as the RMS is consistently high- 
est in those comparisons including Photo. Compared 
to RMSs of order 0.02-0.03 pixels for comparisons with 
Photo, the other algorithms are consistent to 0.01 pixels
or better. We trace this back to Photo's astrometric
corrections derived from the PSF behavior (Pier et al.
2003), which the other algorithms do not account for.
These corrections demonstrably produce better absolute 
astrometry, since they account for biases in positions due 
to the complex PSFs. We thus expect relative astrometry 
to be accomplished in software to better than 0.01 pixels, 
or more than 200 times smaller than the image FWHM. 
Absolute astrometry may require corrections similar to
what has been undertaken by SDSS.

12.2. Centroiding Conclusions 

The LSST SRD relative astrometry requirement of 
0.01" (1/70 the median SRD r-band seeing of 0.7") is not
likely to be violated in software. The "windowed" cen- 
troids of SExtractor 2.4.4 are comparable to the PSF 
centroids of DAOPhot and Photo, and a significant im- 
provement over SExtractor 2.3.2. 

13. PHOTOMETRIC DEPTH 

We select all clustered objects that have been classi- 
fied as a star by each algorithm for each run, and create 
star count histograms. We find the bin with the maxi- 
mum number of stars found by each algorithm, as well 
as the cumulative fraction of the histogram as a function 
of magnitude. We characterize the photometric depth of 
each algorithm by determining the magnitude bins be- 
low which 95% (M95) and 99% (M99) of the objects have 
been detected. These values, as well as the peak of the
functions, are listed in Table [13].
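
A sketch of these depth metrics follows. The function name and the default bin width are assumptions, and each metric is reported as the faint edge of the bin in which the cumulative fraction first reaches the requested level.

```python
import numpy as np

def depth_metrics(mags, bin_width=0.5):
    """Return (peak, M95, M99) from a star-count histogram."""
    mags = np.asarray(mags, dtype=float)
    start = np.floor(mags.min())
    n_bins = int(np.ceil((mags.max() - start) / bin_width)) + 1
    edges = start + bin_width * np.arange(n_bins + 1)

    counts, _ = np.histogram(mags, bins=edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    cumulative = np.cumsum(counts) / counts.sum()

    def mag_at(fraction):
        # First bin whose cumulative fraction reaches `fraction`.
        return edges[1:][np.searchsorted(cumulative, fraction)]

    return centers[np.argmax(counts)], mag_at(0.95), mag_at(0.99)
```

Because the metrics are fractions of each algorithm's own detections, they describe where that algorithm's counts run out rather than absolute completeness.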

13.1. Photometric Depth Results 

Using M99 as a proxy for photometric depth, Photo
is consistently deeper than DAOPhot and DoPhot in PSF
magnitudes, in many cases significantly. We can trace 
this back to the definition of "significance" in the ob- 
ject detection stages. For example, DAOPhot triggers off 
the central pixel of an object in the image convolved 
with its PSF, yielding a weighted sum of neighboring pix- 
els. Photo does a similar smoothing, but also grows the 
source by an amount approximately equal to the radius 
of the seeing disk, and defines a source as a connected set 
of pixels that are detected in at least one of the 5 passbands.
Unfortunately, it is not sufficient to merely lower
DAOPhot's object detection threshold to compensate for
these differences without also enacting a change in how
the algorithm evaluates the notion of "significance". By
lowering the threshold we would be allowing an unaccept- 
able number of artifacts through along with the fainter 
astronomical objects. The ideal object detection algo- 
rithm would trigger off of medium significance pixels and 
determine the integrated significance of all neighboring 
(e.g. 8-connected) pixels, comparing the latter to the 
user-defined detection threshold. 
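
The detection scheme suggested here can be illustrated with the sketch below. It is not code from any of the tested packages. It assumes a background-subtracted image with uniform, uncorrelated per-pixel noise `sigma`, and the function name and both threshold defaults are hypothetical.

```python
import numpy as np
from collections import deque

def detect_sources(image, sigma, seed_nsigma=1.5, detect_nsigma=5.0):
    """Group 8-connected pixels above a low per-pixel threshold and keep
    groups whose integrated signal-to-noise exceeds detect_nsigma.

    Returns a list of (pixel_list, integrated_snr) tuples.
    """
    image = np.asarray(image, dtype=float)
    above = image > seed_nsigma * sigma
    visited = np.zeros(image.shape, dtype=bool)
    ny, nx = image.shape
    sources = []

    for y0, x0 in zip(*np.nonzero(above)):
        if visited[y0, x0]:
            continue
        # Breadth-first flood fill over the 8 neighbors of each pixel.
        queue = deque([(y0, x0)])
        visited[y0, x0] = True
        pixels = []
        while queue:
            y, x = queue.popleft()
            pixels.append((y, x))
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if (0 <= yy < ny and 0 <= xx < nx
                            and above[yy, xx] and not visited[yy, xx]):
                        visited[yy, xx] = True
                        queue.append((yy, xx))
        # Integrated significance: summed flux over the noise of N pixels.
        flux = sum(image[p] for p in pixels)
        snr = flux / (sigma * np.sqrt(len(pixels)))
        if snr > detect_nsigma:
            sources.append((pixels, snr))
    return sources
```

With this criterion a 3 x 3 patch of 2-sigma pixels is accepted (integrated significance of 6), whereas an isolated 3-sigma pixel is rejected, which is the behavior a single per-pixel threshold cannot reproduce.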

The comparison between Photo and SExtractor is 
slightly more difficult, since aperture photometry is not 
the ideal measurement to use in star count comparisons. 
For example, the peaks of Photo's aperture photom- 
etry star-counts are frequently 2-3 magnitudes fainter 
than for its PSF star-counts. At least for the g and r 
passbands, the metric M99 is approximately the same 
for aperture and PSF photometry, so we use these fil- 
ters in our SExtractor comparison. On both nights, 
SExtractor stops more than 1 magnitude brighter than 
Photo in g, and slightly less than 1 magnitude in r. 

13.2. Photometric Depth Conclusions 

It is difficult to compare photometric depths in the con- 
text of incomplete star/galaxy separation schemes. The 
star counts of all algorithms are contaminated to some 
degree by galaxies. However, because Photo measures 
and deblends stars and galaxies simultaneously, we believe
this yields the most accurate classification criteria,
and thus the most accurate star counts. 

DAOPhot is primarily designed to photometer stars, and 
while it does a reasonable job of agreeing with Photo on 
object classification (Table [3]), it also is over-complete 
compared to Photo for brighter objects, where Photo is 
known to do well, and is also incomplete for fainter ob- 
jects. The former is likely due to detection of artifacts 
in the images, as well as misclassification of galaxies as 
stars. 

14. ANALYSIS OF GLOBULAR CLUSTER M2 

Globular Cluster M2 (NGC 7089) is located in our 
imaging strip. This cluster contains approximately 
150,000 stars, with a core radius of 0.34". This is a 
highly concentrated structure, and will test the limits of 
any photometric software tasked to analyze it. In fact, 
the majority of Photo's attempts to reduce images containing
this cluster are unsuccessful, failing at the stage
of deblending.

We have chosen to use this particular field to test 
daophot's and allframe's abilities to do stellar photometry
in crowded fields. With the vast majority of objects
in these images being cluster stars, we expect minimal 
contamination from background galaxies. We do however 
expect to encounter problems with the brightest cluster 
stars (13th magnitude), which saturate in the standard
SDSS exposures. In the images we are using, saturated 
pixels and bleeds have been interpolated over by Photo, 
leaving the profiles of these objects inconsistent with the 
PSF. DAOPhot is therefore inclined to consider these ob- 
jects extended, and will fit an ensemble of PSFs to the 
object until enough have been added to "vacuum" up all 
of its flux. 

This analysis will also serve as a proxy for how close 
LSST can observe to the Galactic plane and still maintain 
a given level of photometric precision. However, in such 
crowded fields, aperture photometry is nigh impossible,
and as Section [10] has shown, PSF photometry is unable
to produce results with the required accuracy. It is un- 
clear if it is possible, even in the most idealized case, for 
the SRD requirements to be met in such crowded fields. 

14.1. Photometry 

Due to the degree of stellar crowding in this field, 
OPTICS clustering runs yielded marginal results with a 
clustering distance of 1 pixel (0.4"). This was charac- 
terized by large scatter when matching the centroids of 
objects in daophot and allframe, at the level of 0.8 pixel
RMS in the r-band. We instead chose to cluster the data 
with a half pixel (0.2") clustering distance, which yielded 
much improved results (RMS scatter of 0.04 pixel in the 
r-band). Clustering at a quarter pixel (0.1") did not 
significantly alter the results. 

The results for the $\Delta M$ distribution measurements are
listed in Table [14]. For both algorithms, we used the
star-galaxy classification schemes derived from the previous
analyses and described in Table [TJ 

The results of this analysis are very encouraging. We 
first note that the first two sets of data (daophot and
allframe) in Table [14] correspond to objects classified
by daophot as stars. To have clustered with daophot
detections, this subset of the data will not reach as deep
as the full allframe reductions. Therefore these numbers
do not directly reflect allframe's photometry of
faint objects, but instead the fact that allframe is better
able to deblend the stars used in this analysis from
faint objects that were missed in daophot. The second
set of allframe results is for objects classified by
allframe as stars, and thus also probes the distribution
of stars missed in daophot because they were too
faint or blended. We emphasize that the PSFs used in
the two analyses are exactly the same, and any improvements
may be directly attributed to better deblending
and centroiding.

The aperture photometry results are considerably 
worse here than as reflected in the sparse-field analysis
described in Table [4] and Table [8]. This is to be expected,
as the field is extraordinarily crowded and there
is a very steep and significant background sky gradient 
due to unresolved cluster stars. Both the r and i-band 
aperture results are considerably worse than in the other 
passbands, in this case due to the extreme crowding con- 
ditions in these filters. 

The PSF photometry shows a marked improvement
over the aperture photometry results, particularly in the
r and i-band data where the images are most crowded.
The g-band PSF photometry is the most problematic in
the DAOPhot reductions. However, the magnitude scatter
for objects classified by daophot as stars is reduced
by approximately 25% when going to the stacked analysis
of allframe. In particular, the g-band photometry
improves significantly, suggesting that DAOPhot did
a poor job of selecting all the stellar g-band objects,
and a proper deblending was only possible by using constraints
from the r and i-band data. We also note that
the allframe PSF photometry results are commensurate
with the sparse-field analyses described in Table [4]
and Table [8]. This indicates that daophot+allframe is
indeed a powerful combination that is able to perform
consistent stellar PSF photometry across the range of
crowding conditions expected in LSST.

The final set of numbers in Table [14], reflecting the
analysis of objects classified by allframe as stars, shows 
a slight increase in the scatter of photometric measure- 
ments. The degradation is likely due to the impact of 
allframe detecting fainter, more crowded objects, for 
which photometry is more difficult. However, the PSF- 
photometry results are still better than daophot's single- 
image analysis of this field, and essentially equivalent to 
the sparse-field analysis results presented in Table [8].

14.1.1. Photometry as a Function of Crowding 

Given the broad range of stellar densities in these images,
we are able to constrain how DAOPhot's ability to do
PSF photometry degrades as a function of local crowding
conditions. To do this we have divided the image up into
200 pixel by 200 pixel regions, and select only those objects
that allframe classifies as stars in both runs. We
count the total number of such objects in this region, as
well as the total number of "bright" objects in this region,
where we define "bright" as the brightest 3 magnitudes
of objects. We calculate $\sigma_{\rm mean}$ from the interquartile
range of $\Delta M$ for the bright objects, and plot this against
the total number of stars in the bin. We normalize this
by the area of the box, yielding the local number of stars
per pixel, and then multiply by the averaged FWHM$^2$
of the two images, yielding the approximate number of
stars per seeing disk. We fit a line to the relationship of
$\Delta M$ vs number of stars per FWHM$^2$. These results are
summarized in Table [15]. We show the plots for the i-band
data in Figure [8]. Extrapolation back to an empty
field (number of stars = 0) yields numbers that are very
close to the SRD requirement on photometric accuracy.
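
The normalization and extrapolation can be sketched as below. The function names are hypothetical, and the averaged FWHM$^2$ (in pixels squared) is taken as an input so that no particular averaging convention is assumed.

```python
import numpy as np

def stars_per_seeing_disk(n_stars, fwhm_sq_pix, box_pix=200):
    """Convert star counts in a box_pix x box_pix region to stars per FWHM^2."""
    n_stars = np.asarray(n_stars, dtype=float)
    stars_per_pixel = n_stars / box_pix**2
    return stars_per_pixel * fwhm_sq_pix

def empty_field_scatter(density, sigma_mean):
    """Linear fit of scatter vs. crowding; the intercept is the scatter
    extrapolated to an empty field (zero stars per seeing disk)."""
    slope, intercept = np.polyfit(density, sigma_mean, 1)
    return slope, intercept
```

For example, 400 stars in a 200 x 200 pixel region with FWHM$^2$ = 16 pixels$^2$ corresponds to 0.16 stars per seeing disk.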

14.2. Photometric Depth 

We select stars on an algorithm-by-algorithm basis,
and find the peaks of the star count histograms are the
same for both daophot and allframe, approximately
r = 20.5, g = 21.0, i = 20.2 for run 4207. However,
allframe finds approximately 1.5 times the total number
of objects in the g-band data, 1.3 in the r-band, and
1.4 in the i-band. This is due to allframe's ability to
resolve and photometer blended neighbors that contaminate
an object's sharpness in daophot, as well as its extra
photometric depth. Table [16] characterizes the depth
per run and passband. For both algorithms, we list the
peak of the histogram (Mmax), the magnitude bin below
which 95% of the stars are contained (M95), and the bin
below which 99% of the stars are contained (M99). Using
M99 as our proxy, allframe accurately photometers
objects nearly a magnitude deeper than in daophot in
the g-band, 0.3 magnitudes in the r-band, and 0.5 magnitudes
in the i-band. This is a remarkable improvement
considering that we only have 2 images per passband to
work with. The fact that we can combine the constraints
from images in different filters into a global analysis allows
us to make such improvements in depth.

Figure [9] shows an r vs. $g - r$ color-magnitude diagram
(CMD) of all stars in the SDSS images containing M2. 
We have not selected against field stars, which contami- 
nate the cluster CMD. For each algorithm, we query for 
all clustered objects that were classified as stars in both 
runs and in both passbands to yield the final ensembles 
of points. Allframe finds 1.7 times the number of stars
as daophot. We plot the averaged magnitudes and colors 
of the objects, as well as typical error bars on each point 
in 8 magnitude bins. 

14.3. Conclusions from Study of M2 

The allframe analysis has shown that it is an encouraging
precursor to LSST's envisioned Deep Detection
Pipeline ensemble analysis of imaging data (Roat et al.
2005). We are able to use all images of a given part of the
sky to attain extra depth and precision in the measure- 
ments of all objects in the field. Potential improvements 
to this process include regeneration of the PSF during 
the ensemble analysis, as well as characterization of ex- 
tended objects. 

15. PROCESSING TIME AND SCALABILITY 

During processing, we recorded the total elapsed time 
to run each algorithm on all images. However, dur- 
ing testing we noticed severe degradations in perfor- 
mance during periods of heavy disk access. This is a 
known problem with the Redundant Array of Indepen- 
dent Drives (RAID) controller on the host machine, and 
makes the absolute numbers in this section inaccurate. 
The relative numbers are likely to be less affected. 

We do not have information for DoPhot on run 4207
because the file containing the times for this run was
corrupted. We emphasize that the DAOPhot results are
not entirely localizable to the internal algorithms, but are
also due to inefficiencies in our controlling Perl scripts
(Section [5]). We fit the trend of processing time with
the number of detections, and present these results in
Table [17]. SExtractor is the fastest algorithm, with version
2.3.2 slightly faster than version 2.4.4, primarily due
to the overhead in calculating windowed quantities in 
the latter. There appears to be a minimum threshold of 
at least 4 seconds necessary for SExtractor to process 
an individual stitched image regardless of the number of 
detections found, due to overhead associated with the 
reading and writing of data products. DAOPhot shows a 
significant trend with number of detections and has the 
steepest scaling laws. The DoPhot entry in Table [17] is 
a bit misleading, as DoPhot tends to be relatively insen- 
sitive to the number of objects ultimately detected in 
the image. This suggests that much of the processing 
time is spent on common-mode items such as the PSF 
generation. 

15.1. Additional Testing 

In an effort to eliminate the influence of the RAID 
controller, we also ran time trials on a new computer. 
We selected four images (two from each run) covering the 
range of total detections per image found by SExtractor 
in the r-filter. The "stitched" images are approximately 
2k x 4k in size. We decided to examine the scaling of
resource usage with image size by chopping each image
into a 2k x 2k image. We also produce an LSST-sized
image by placing a copy of each image next to itself to
yield a 4k x 4k image. To determine how bit depth might
affect SExtractor's behavior, we store a copy of each
image as 16 and 32-bit integers (BITPIX=16,32), and as
32 and 64-bit floats (BITPIX=-32,-64). In summary, we
have 4 images with different numbers of objects; we have
3 copies of each image in different sizes; and we store
each of these with 4 different bit depths. In total, this
yields 48 different configurations.
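
The test grid can be enumerated as in the sketch below. The image labels are placeholders, since the actual file names are not given here.

```python
from itertools import product

# Placeholder labels for the four selected images (two per run).
images = ["image_1", "image_2", "image_3", "image_4"]
sizes = ["2k x 2k", "2k x 4k", "4k x 4k"]
bitpix_values = [16, 32, -32, -64]   # FITS BITPIX: integers > 0, floats < 0

configurations = list(product(images, sizes, bitpix_values))
print(len(configurations))   # 4 images x 3 sizes x 4 bit depths
```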

Each of these images was SExtracted 50 times in a 
row to determine the average elapsed time per image, 
averaging over any extraneous system load. SExtractor 
was run while there were no other tasks queued on the 
machine for the duration of each run. We monitored the 
memory usage of each process as a function of time by 
scanning the file /proc/PID/status every half second.
We extract the values VmSize and VmRSS. VmSize is the 
total amount of memory required by this program, and 
VmRSS is the "Resident Set Size" (the amount actually 
in memory at a given moment). We extracted the total 
processing time by using the executable /usr/bin/time 
and summing the user CPU and system CPU times - 
each process had 98% or greater of the CPU. Table [18] 
lists the results of these trials. 
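The memory sampling described above might be implemented along the following lines. This is a sketch under assumptions, not the script used for the tests: the function names are hypothetical, it assumes a Linux /proc filesystem, and it polls until the process exits or no longer reports memory fields.

```python
import re
import time


def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/PID/status."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def sample_memory(pid, interval=0.5, max_samples=10000):
    """Poll /proc/<pid>/status every `interval` seconds.

    Returns a list of (elapsed_seconds, VmSize_kB, VmRSS_kB) tuples.
    """
    samples = []
    start = time.time()
    for _ in range(max_samples):
        try:
            with open(f"/proc/{pid}/status") as handle:
                fields = parse_vm_fields(handle.read())
        except OSError:
            break  # the process has exited
        if not fields:
            break  # no Vm* lines reported (e.g. process is finishing)
        samples.append(
            (time.time() - start, fields.get("VmSize"), fields.get("VmRSS"))
        )
        time.sleep(interval)
    return samples

# Example use (hypothetical PID): samples = sample_memory(12345)
```

With a 0.5 second interval, runs lasting about one second yield only two or three samples, which is why the shortest analyses are poorly resolved.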

We first examine the profiling as a function of image
bit depth. The maximum memory used by SExtractor
is not a function of image bit depth for a given-sized
image. This suggests that SExtractor translates an image
into a "native" bit depth before processing. The
total processing times for BITPIX of 16, 32, and -32 are
very similar; the BITPIX = -64 images take on average
10% longer to process, suggesting significant overhead in
translating from 64-bit images. We restrict our analysis
henceforth to 32-bit float images.

We next look at the memory consumed as a function 
of time for a given run. Since we only sample the mem- 
ory usage in 0.5 second intervals, this will be somewhat 
poorly determined for the short analyses. We choose to
make representative plots using the last image in
Table [18]. Figure [10] shows the average memory usage as
a function of time for the 3 image sizes. Note that the
total processing time shown here can be up to 0.5 seconds
smaller than the values listed in Table [18] due to
our coarse sampling.

It is interesting to note the memory consumption pro- 
files generally differ due to the different processing times, 
but the maximum memory used does not scale directly 
with the image size or the total number of objects. The 
memory requirements grow only marginally more expen- 
sive, suggesting that SExtractor undertakes an effective 
degree of intelligent memory management. For example,
the 4k x 4k image consumes less than twice as much
memory as the 2k x 4k image.
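The comparison of memory use to image size can be made concrete with a short calculation. This sketch assumes that "2k" means 2048 pixels and that the image size is the raw pixel payload (pixels times bytes per pixel); the exact frame dimensions are not stated here, so the ratios only approximate the parenthesized values in Table [18]. The function names are hypothetical.

```python
def image_size_kb(n_x, n_y, bitpix):
    """Raw pixel payload in kB for an n_x by n_y image with FITS BITPIX."""
    bytes_per_pixel = abs(bitpix) // 8
    return n_x * n_y * bytes_per_pixel / 1024.0


def memory_ratio(vm_kb, n_x, n_y, bitpix):
    """Memory used, expressed as a multiple of the image size."""
    return vm_kb / image_size_kb(n_x, n_y, bitpix)


# VmSize values for the 2k x 2k image taken from the Table [18] excerpt.
for bitpix, vm_size_kb in [(16, 26134), (32, 25943), (-32, 26015), (-64, 26107)]:
    ratio = memory_ratio(vm_size_kb, 2048, 2048, bitpix)
    print(f"BITPIX={bitpix:4d}  VmSize/image = {ratio:.2f}")
```

Because VmSize stays near 26 MB for every bit depth, the ratio falls as the stored pixel size grows, consistent with a conversion to a single internal representation.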

We next examine the total processing time as a function
of the number of objects in the image. These data
are plotted in Figure [11]. We plot the data from the 2k
x 2k images as circles, 2k x 4k as squares, and 4k x 4k
as triangles. A linear regression yields the relationship
y = 0.0007 x + 0.5468, where x is the number of objects
and y is the processing time in seconds. Comparing this
to the entries in Table [17] is instructive. The zero-point
processing time of 0.5 seconds is much shorter than previous
results of ~4 seconds, almost certainly due to the
aforementioned RAID issues impeding disk I/O. The slope
is similar: every ~1300 objects being measured adds an
additional second of processing time. We regard the tests
on this machine as yielding the most reliable timing results.
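For reference, a fit of this kind can be reproduced with an ordinary least-squares line. The sketch below reads the quoted coefficients as an intercept of about 0.55 s and a slope of about 0.0007 s per object, which is what the 0.5 second zero point and the roughly 1300 objects per second imply. The object counts in the example are synthetic placeholders, not the measured data, and the function names are hypothetical.

```python
def fit_line(x_values, y_values):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predicted_time(n_objects, slope=0.0007, intercept=0.5468):
    """Processing time in seconds predicted by the linear model."""
    return intercept + slope * n_objects


# Synthetic (object count, time) pairs generated from the model itself.
counts = [500, 1000, 2000, 4000]
times = [predicted_time(n) for n in counts]
slope, intercept = fit_line(counts, times)
print(f"slope = {slope:.4f} s/object, intercept = {intercept:.4f} s")
```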

15.2. Processing Time Conclusions 

SExtractor version 2.3.2 was the fastest of these algo- 
rithms. However, with slightly longer processing time we 
gain a considerable amount of accuracy in the position 
and shapes of detected objects by using the "windowed" 
parameters from SExtractor 2.4.4. 

Disk access is a fundamental issue that can significantly 
impede image processing tasks. 

The timing tests in Section [15.1] produce the most reliable
absolute numbers. If we assume that the LSST focal
plane is populated with 4k x 4k devices, then we expect
that a single detector may be photometered in (0.5 s) *
(2.8 GHz) = 1.4 GHz s, with an additional overhead of
1.4 GHz s for every 1300 objects in the image. We have
not tested how these numbers scale with processor speed.

16. SUMMARY OF RESULTS 
16.1. Star/Galaxy Separation 

Each package undertakes some measure of object clas- 
sification. In all cases, the benchmark profile is the PSF. 
DAOPhot and DoPhot compare each object to the PSF
profile. SExtractor compares the width of each object
with the input PSF FWHM. In comparison, Photo compares
the flux measured using the PSF to the flux from
galaxy model fits.

Both DoPhot and SExtractor fared poorly compared
to DAOPhot and Photo (Tables [2] and [3]). However,
SExtractor has the option to use neural-network filters
to enhance its performance. DAOPhot does a good job
at object classification, but does not explicitly compute
object moments. Objects where DAOPhot and Photo disagree
tend to be drawn from the stellar locus (Figure [4]).

Photo is the most advanced package in this task, with 
SExtractor having the most potential for improvement 
through add-on software like EyE. 

16.2. Photometry 

Both DAOPhot and Photo are able to satisfy LSST's science
requirements on photometric accuracy (0.005 magnitudes
unless precluded by photon statistics) for aperture
measurements only. This is realized in the g, r,
and z-band datasets. PSF photometry is unable to reach
this accuracy, and consistently falls short by a factor
of ~2-3. DAOPhot provides marginally better results
than Photo in both aperture and PSF photometry in our
normal analysis. DoPhot consistently under-performs in
both aperture and PSF photometry. SExtractor provides
adequate aperture photometry, but does not yet
have the capability to easily build and use a PSF model.
These results are summarized in Table [4] (for Photo-selected
stars) and Table [8] (for algorithm-selected stars).

The additional scatter in the PSF magnitudes can be 
traced back to inadequate aperture corrections to the 
PSF flux. We highlight that the determination of this 
quantity, as well as its spatial variation across an image, 
is a crucial issue in LSST algorithm development. 

From our analysis of globular cluster M2, we find that
DAOPhot is able to provide PSF magnitudes in a crowded
field with an accuracy similar to a sparse field analysis.
A stacked analysis of the data using allframe yields an
improvement of approximately 25% (Table [14]) in photometric
accuracy, and a passband-dependent increase
in photometric depth (Table [16]). We find a marginal
degradation in photometric accuracy with local crowding
conditions (Table [15]). Allframe is able to maintain
2% accuracy in r-band PSF photometry in crowding of
up to 0.12 stars per PSF FWHM^2 (~ 880 stars arcmin^-2
in 0.7" seeing).
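The crowding figures quoted above can be checked with a few lines of arithmetic. The sketch below converts a density in stars per square FWHM to stars per square arcminute, and evaluates the linear width-versus-crowding relation using the intercepts and slopes listed in Table [15]. Treating the crowding axis of that fit as stars per square FWHM is an assumption made here for illustration, and the function names are hypothetical.

```python
# (intercept, slope) of the Delta-M width vs. crowding fit, from Table [15].
CROWDING_FIT = {
    "u": (0.020, 0.121),
    "g": (0.018, 0.134),
    "r": (0.008, 0.103),
    "i": (0.008, 0.077),
    "z": (0.007, 0.050),
}


def stars_per_arcmin2(stars_per_fwhm2, fwhm_arcsec):
    """Convert stars per FWHM^2 to stars per square arcminute."""
    fwhm2_per_arcmin2 = 3600.0 / fwhm_arcsec ** 2  # 1 arcmin^2 = 3600 arcsec^2
    return stars_per_fwhm2 * fwhm2_per_arcmin2


def delta_m_width(band, stars_per_fwhm2):
    """Width of the Delta-M distribution (magnitudes) at a given crowding."""
    intercept, slope = CROWDING_FIT[band]
    return intercept + slope * stars_per_fwhm2


print(round(stars_per_arcmin2(0.12, 0.7)))   # roughly 880 stars per arcmin^2
print(round(delta_m_width("r", 0.12), 3))    # roughly 0.02 mag, i.e. 2%
```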

16.3. Shape Measurements 

SExtractor and Photo are the only packages that provide
reliable estimates of object shapes, using second moment
analysis. Photo is also the only package that
fits galaxy models (exponential, de Vaucouleurs) to each
object. SExtractor version 2.3.2 uses isophotal second
moments, which degrade rapidly as a function of magnitude
compared to Photo's adaptive second moments
(e.g. left panel of Figure [6]). These measurements should
not be used to measure the shapes of stars. SExtractor
versions 2.4.4 and greater use "windowed" second moments
that yield ellipticities comparable to Photo's (e.g.
right panel of Figure [6]). Photo and SExtractor 2.4.4's
stellar ellipticity measurements are extremely consistent,
their differences having an RMS of 0.001-0.004 (Table [9]).
This is more than a factor of 10 smaller than LSST's sci- 
ence requirement that the median of the distribution be 
no larger than 0.04, indicating that the algorithmic con- 
tribution to the stellar ellipticity distribution should be 
negligible. 

16.4. Centroiding 

By comparing the calculated x,y centroids of objects to
Photo's centroids, we find very strong systematic trends
in isophotal centroiding accuracy as a function of magnitude
for SExtractor version 2.3.2 (top panel of Figure [7];
Table [11]). The windowed centroids in SExtractor version
2.4.4 and greater remedy this systematic (bottom
panel of Figure [7]). The centroiding RMS at the bright
end (compared to Photo) for most algorithms is 1/100
the PSF FWHM. An algorithm-to-algorithm comparison
yields a typical centroiding RMS of better than 1/200
the FWHM, with Photo the clear outlier due to its absolute
astrometry corrections (Table [12]).

The LSST relative astrometry requirement of 0.01" is 
not likely to be violated in software. The absolute as- 
trometry requirements of 0.05" may require corrections 
similar to Photo's. 

16.5. Summary 

The one area where current algorithms do not clearly 
exceed the constraints set out in LSST's SRD is in photo- 
metric accuracy. Photo and DAOPhot are able to deliver 
the requisite quality, but only in aperture photometry, 
and then just at the threshold of acceptability. Advances 
in PSF modeling and in wide-field aperture corrections 
and sky subtraction are likely needed to ensure that the 
software can deliver on the promise of LSST. 

To summarize Photo's advantages: Its aperture photometry
meets the LSST science requirements; its PSF
photometry is as good as DAOPhot's; it is reliably able to
discriminate stars from galaxies; it is the only algorithm
that does galaxy model fitting; the 5-band simultaneous
photometry is very similar to the envisioned LSST Deep
Detection analysis; and its star/galaxy deblender is robust
under a variety of conditions. The disadvantages
of Photo are: it is not very flexible with respect to the
format of input data, only operating on SDSS images;
the code as designed is not very portable; the deblender
is not designed for crowded fields.

To summarize DAOPhot's advantages: Its PSF photometry
is the best among the algorithms considered
here; star/galaxy separation is surprisingly robust; it
provides the best solution for point source photometry
in crowded fields; allframe is also a useful Deep Detection
precursor algorithm. Its disadvantages are: it is
relatively slow, and it does no galaxy characterization.

To summarize DoPhot's advantages: It is easily
pipelined, and will take almost any input data. Its
disadvantages are: its PSF does not vary spatially, and
it returns the poorest results with respect to both photometry
and astrometry (excluding SExtractor isophotal
centroids).

Finally, to summarize SExtractor's advantages: It is
very fast and the code is very portable; its aperture photometry
returns acceptable results; its windowed shapes
are as good as Photo's adaptive shapes; the windowed
centroids are as good as PSF centroids; the deblending
model is very extensible; and the inclusion of neural networking
for object classification is novel and potentially
very powerful. Its disadvantages are: there is no easily
accessible PSF modeling, and the isophotal shape and
positional measurements may be significantly biased at
faint magnitudes.

[Figure 1 plot title: Clustering 2.4 Million Points; x-axis: Number of Points per Page]

Fig. 1. - Run time for clustering 2.4 million points as a function of leaf size in the internal lean-tree database used by OPTICS. Note the
y-axis in units of 10^ seconds.

Fig. 2. - Distribution of the Sharp parameter for DAOPhot reductions of r-band data from run 4207. The left figure shows objects that
Photo classifies as stars, and the right figure objects that Photo classifies as galaxies. The data are split by magnitude into 4 bins. The
dashed line shows the cumulative fraction. Note the distribution is symmetric around value 0.0 for stars and biased towards values greater
than 0.0 for galaxies.

Fig. 3. - Distribution of the CLASS_STAR parameter for SExtractor reductions of r-band data from run 4207. The top panel shows
objects that Photo classifies as stars, and the bottom panel objects that Photo classifies as galaxies. The data are split by magnitude into
4 bins. The dashed line shows the cumulative fraction. Note the highly skewed distributions.

[Figure 4 plot title: dao_pho_3437_ccd : 14.0 < r < 20]

Fig. 4. - These panels show g - r, r - i diagrams (derived from the Photo magnitudes) for objects with 14 < r < 20. These are the
subset of objects that had detections in g, r, and i in DAOPhot and Photo from run 3437. In the upper left is the set of objects that both
DAOPhot and Photo called stars; in the upper right, DAOPhot classified as a star and Photo classified as a galaxy; in the lower left, DAOPhot
classified as a galaxy and Photo classified as a star; in the lower right, both algorithms classified as galaxies.

[Figure 5 panel titles: dao3437 dao4207 r star ap; dao3437 dao4207 r star psf]

Fig. 5. - Figure described in Section Il2l for DAOPhot's r-band photometry of stars. Figure on the left is for aperture photometry, on the
right is PSF photometry.

Fig. 6. - Comparison of run 3437 r-band galaxy ellipticity measurements in SExtractor and Photo. e1 is plotted as green triangles,
and e2 as red squares. In each figure, the 4 panels are for data in different i-band magnitude bins, and compare the shape measured in
SExtractor on the x-axis, and Photo on the y-axis. The left figure shows results from SExtractor 2.3.2 and the right figure SExtractor
2.4.4. The lines show the best fits given in Table [10], dashed for e1 and solid for e2.

Fig. 7. - Differences in measured stellar positions between SExtractor and Photo plotted as a function of magnitude for z-band data
from run 4207. The x coordinate is perpendicular to the scan direction in SDSS data. The left panel shows these results for SExtractor
2.3.2, while the right panel shows the results for SExtractor 2.4.4. These particular plots were chosen to demonstrate the improvements in
centroiding between SExtractor versions 2.3.2 and 2.4.4.

[Figure 8 plot title: Allframe Analysis of M2; r-band]

Fig. 8. - σ_ΔM plotted as a function of local crowding conditions, derived from allframe analysis of globular cluster M2, for the r-band
data. We divided the image up into multiple regions and for each derived the width of the ΔM distribution from the brightest 3 magnitudes
of stars. We normalized the number of all stars in each region by the area of the region and the average FWHM of the two images. The
x-axis reflects the crowding conditions, and corresponds to the total number of stars per seeing disk.

[Figure 9 panel titles: Daophot; Allframe]

Fig. 9. - Color-magnitude diagram (CMD) of M2 reconstructed from daophot and allframe analysis. All clustered objects classified by
each algorithm as stars in both runs and in both the r and 3-bands were used. We also plot typical error bars in 8 magnitude bins. The
allframe CMD contains 70% more points than the daophot CMD, and reaches approximately 0.3 magnitudes deeper in the r-band.

[Figure 10 plot title: SExtractor Memory Profile]

Fig. 10. - Detailed look at the average memory required by SExtractor as a function of time. The solid lines correspond to VmSize,
while the dashed lines correspond to VmRSS.




Fig. 11. - Plot of total SExtractor processing time as a function of number of detections in the image. The red circles are from the 2k x
2k images, blue squares from the 2k x 4k images, and green triangles from the 4k x 4k images. A joint fit to all the data is shown in black,
with the functional form y = 0.0007 x + 0.5468.

TABLE 1

DAOPhot "Sharp" Distribution for Photo-Selected Stars

Filter   Mean    RMS
u        0.004   0.096
g        0.001   0.062
r        0.000   0.043
i        0.003   0.045
z        0.003   0.081

Note. - Distribution of DAOPhot "Sharp" parameters for objects classified by Photo as stars. We find these distributions by combining
all data from runs 3437 and 4207. These numbers were derived from the 3σ clipped distribution of Sharpness parameters for all DAOPhot
measurements that were clustered with objects Photo classified as stars between r = 14th and r = 20th magnitude. DAOPhot-selected
stars are subsequently defined as anything having a sharpness within ±3 RMS of the mean. DAOPhot-selected galaxies are objects with
a sharpness larger than +3 RMS of the mean; objects with sharpness smaller than -3 RMS of the mean are likely cosmic rays or other
defects.
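The selection rule in the note above amounts to a three-way cut on the Sharp parameter. A minimal sketch follows, using the means and RMS values from Table 1; the function name and the returned labels are hypothetical.

```python
# (mean, RMS) of the DAOPhot Sharp parameter per filter, from Table 1.
SHARP_STATS = {
    "u": (0.004, 0.096),
    "g": (0.001, 0.062),
    "r": (0.000, 0.043),
    "i": (0.003, 0.045),
    "z": (0.003, 0.081),
}


def classify_by_sharp(sharp, band, n_rms=3.0):
    """Classify an object from its Sharp value in the given filter."""
    mean, rms = SHARP_STATS[band]
    if sharp > mean + n_rms * rms:
        return "galaxy"   # broader than the PSF
    if sharp < mean - n_rms * rms:
        return "defect"   # likely a cosmic ray or other defect
    return "star"


print(classify_by_sharp(0.01, "r"))   # within +/- 3 RMS of the mean
print(classify_by_sharp(0.20, "r"))   # above mean + 3 RMS
```

Since the RMS differs by filter, the same Sharp value can fall on different sides of the cut in different bands.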



TABLE 2

Object Classification; Algorithm vs. Photo

Algorithm    Run    Filter   S-S    S-G    G-S    G-G
DAOPhot      3437   g        0.93   0.01   0.02   0.04
                    r        0.82   0.01   0.05   0.12
                    i        0.81   0.01   0.05   0.13
             4207   g        0.95   0.01   0.01   0.03
                    r        0.87   0.01   0.02   0.09
                    i        0.85   0.02   0.03   0.10
DoPhot       3437   g        0.93   0.04   0.00   0.03
                    r        0.87   0.07   0.00   0.05
                    i        0.83   0.13   0.00   0.04
             4207   g        0.96   0.01   0.00   0.03
                    r        0.91   0.05   0.00   0.04
                    i        0.87   0.09   0.00   0.03
SExtractor   3437   g        0.35   0.00   0.59   0.06
                    r        0.57   0.00   0.28   0.15
                    i        0.56   0.00   0.25   0.18
             4207   g        0.90   0.00   0.05   0.05
                    r        0.83   0.01   0.02   0.14
                    i        0.74   0.01   0.09   0.16


Note. - The fraction of total clustered objects brighter than 21st magnitude classified by the algorithm and Photo as a star (S-S); classified 
by the algorithm as a star and Photo as a galaxy (S-G); classified by the algorithm as a galaxy and Photo as a star (G-S); and classified by 
both the algorithm and Photo as a galaxy (G-G). This table indicates the degree of agreement between algorithms for a given set of data. 
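The four fractions defined in the note above can be tallied from two matched lists of classifications. This is an illustrative sketch only: the single-letter labels ("S" for star, "G" for galaxy), the function name, and the example lists are hypothetical, and it assumes the objects have already been matched (clustered) between the two catalogs.

```python
from collections import Counter


def agreement_fractions(algorithm_labels, photo_labels):
    """Return the S-S, S-G, G-S, and G-G fractions for matched objects.

    The first letter of each key is the algorithm's classification and
    the second is Photo's.
    """
    if len(algorithm_labels) != len(photo_labels):
        raise ValueError("label lists must have the same length")
    total = len(algorithm_labels)
    if total == 0:
        raise ValueError("no objects to compare")
    pairs = Counter(zip(algorithm_labels, photo_labels))
    return {f"{a}-{p}": pairs[(a, p)] / total for a in "SG" for p in "SG"}


# Hypothetical classifications for ten matched objects.
algorithm = list("SSSSSSSSGG")
photo = list("SSSSSSSGSG")
print(agreement_fractions(algorithm, photo))
```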



TABLE 3

Object Classification; Algorithm vs. Itself

Algorithm    Filter   S-S    S-G    G-S    G-G
DAOPhot      g        0.77   0.06   0.06   0.12
             r        0.65   0.07   0.06   0.22
             i        0.68   0.06   0.05   0.20
DoPhot       g        0.92   0.03   0.01   0.04
             r        0.93   0.02   0.01   0.03
             i        0.93   0.02   0.02   0.04
Photo        g        0.94   0.02   0.00   0.04
             r        0.90   0.01   0.00   0.08
             i        0.86   0.02   0.01   0.11
SExtractor   g        0.23   0.58   0.01   0.17
             r        0.43   0.35   0.01   0.21
             i        0.45   0.24   0.02   0.29



Note. - The fraction of total clustered objects brighter than 21st magnitude classified by the algorithm in both runs as a star (S-S);
classified as a star in run 4207 and galaxy in 3437 (S-G); classified as a galaxy in run 4207 and star in 3437 (G-S); and as a galaxy in both
runs (G-G). This table indicates the degree of agreement within a given algorithm for a given set of objects.

TABLE 4

Width of ΔM Distribution For Photo-Selected Stars

Algorithm          Magnitude   u       g       r       i       z
DAOPhot            Aperture    0.027   0.006   0.006   0.007   0.015
                   PSF         0.032   0.018   0.018   0.017   0.017
DoPhot             Aperture    0.024   0.009   0.008   0.008   0.011
                   PSF         0.031   0.026   0.031   0.037   0.031
Photo              Aperture    0.027   0.007   0.006   0.007   0.015
                   PSF         0.029   0.019   0.019   0.021   0.019
SExtractor 2.3.2   Aperture    0.057   0.009   0.008   0.010   0.035
                   PSF         ...     ...     ...     ...     ...
SExtractor 2.4.4   Aperture    0.057   0.009   0.008   0.010   0.035
                   PSF         ...     ...     ...     ...     ...

Note. - Characteristic widths of ΔM, evaluated 1 magnitude below the brightest non-saturated object, representing the repeatability
of photometric measurements of objects classified by Photo as stars, as described in Section [10]. Measurements compatible with LSST's
science requirements (0.007 magnitudes) are highlighted in bold.



TABLE 5

Width of Stellar r-band ΔM Distribution Algorithm to Algorithm: Aperture Magnitudes

                   DAOPhot   DoPhot   Photo   SExtractor 2.4.4
DAOPhot            ...       0.011    0.009   0.009
DoPhot             0.007     ...      0.007   0.010
Photo              0.006     0.005    ...     0.008
SExtractor 2.4.4   0.007     0.008    0.005   ...

Note. - Comparison of the characteristic width of ΔM at the bright end of the distribution derived from comparisons of different algorithms
on the same images. The upper triangular matrix reflects r-band aperture measurements of Photo-selected stars seen in run 3437, and the
lower triangular for run 4207.



TABLE 6

Width of Stellar r-band ΔM Distribution Algorithm to Algorithm; PSF Magnitudes

          DAOPhot   DoPhot   Photo
DAOPhot   ...       0.033    0.018
DoPhot    0.031     ...      0.032
Photo     0.018     0.025    ...

Note. - Comparison of the characteristic width of ΔM at the bright end of the distribution derived from comparisons of different algorithms
on the same images. The upper triangular matrix reflects r-band PSF measurements of Photo-selected stars seen in run 3437, and
the lower triangular for run 4207.



TABLE 7

Width of ΔM Distribution For Photo-Selected Stars; PSF vs. Aperture Magnitudes

Algorithm   Run    u       g       r       i       z
DAOPhot     3437   0.021   0.013   0.013   0.016   0.021
            4207   0.021   0.016   0.014   0.019   0.024
DoPhot      3437   0.022   0.018   0.024   0.028   0.030
            4207   0.017   0.018   0.027   0.034   0.024
Photo       3437   0.021   0.014   0.012   0.013   0.015
            4207   0.020   0.017   0.014   0.015   0.018

Note. - Characteristic widths representing the repeatability of photometric measurements of objects classified by Photo as stars, as described
in Section [10]. This table compares aperture vs. PSF magnitudes, and is primarily sensitive to spatial variations in the aperture corrections
to PSF photometry.






Becker et al. 



TABLE 8

Width of ΔM Distribution For Algorithm-Selected Stars

Algorithm          Magnitude   u       g       r       i       z
DAOPhot            Aperture    0.017   0.007   0.007   0.008   0.006
                   PSF         0.030   0.020   0.018   0.016   0.017
DoPhot             Aperture    0.017   0.009   0.009   0.009   0.007
                   PSF         0.027   0.026   0.032   0.036   0.027
Photo              Aperture    0.027   0.007   0.006   0.007   0.015
                   PSF         0.029   0.019   0.019   0.021   0.019
SExtractor 2.4.4   Aperture    0.019   0.007   0.009   0.010   0.011
                   PSF         ...     ...     ...     ...     ...

Note. - We repeat the analysis summarized in Table [4] but use the algorithm's own classification scheme instead of Photo's (Section [9]).
Objects must be classified as stars in both runs. Photo results are the same as in Table [4].



TABLE 9

Comparison of Stellar r-band Ellipticities

Ellipticity   Algorithm          Run    RMS     Intercept   Slope
e1            SExtractor 2.3.2   3437   0.026    0.020      0.406
e2                               3437   0.021   -0.033      0.447
e1                               4207   0.034   -0.051      0.393
e2                               4207   0.030    0.014      0.420
e1            SExtractor 2.4.4   3437   0.002   -0.003      2.046
e2                               3437   0.001   -0.000      2.060
e1                               4207   0.004   -0.016      2.141
e2                               4207   0.002    0.001      2.181


Note. - Comparison of Photo and SExtractor r-band ellipticity measures for Photo-selected stars with 14 < r < 20. We fit a line to the 
relationship and evaluate the RMS perpendicular to the principal axis. SExtractor 2.3.2 uses "isophotal" shape measures, and SExtractor 
2.4.4 "windowed" shape measures. 



TABLE 10

Comparison of Galaxy r-band Ellipticities

Ellipticity   Algorithm          Run    RMS     Intercept   Slope
e1            SExtractor 2.3.2   3437   0.036    0.005      0.987
e2                               3437   0.037   -0.002      0.976
e1                               4207   0.037   -0.001      0.976
e2                               4207   0.038    0.004      0.973
e1            SExtractor 2.4.4   3437   0.016    0.005      1.834
e2                               3437   0.015   -0.004      1.848
e1                               4207   0.016   -0.001      1.825
e2                               4207   0.017    0.002      1.842

Note. - Same as Table [9], but for Photo-selected galaxies.



TABLE 11

Centroiding Offsets (in Pixels) for Stars as a Function of Magnitude

Algorithm   Run    Filter   M0      Ax       Bx       Cx       Ay       By      Cy
DAOPhot     3437   u        16.41    0.000    0.002    0.000   -0.001   0.000   -0.000
                   g        15.29    0.002   -0.001    0.000   -0.003   0.005   -0.001
                   r        14.79   -0.000    0.000   -0.000   -0.005   0.004   -0.001
                   i        14.61   -0.000    0.000   -0.000   -0.003   0.003   -0.001
                   z        14.44    0.003   -0.003    0.000   -0.001   0.002   -0.000

Note. - Table [11] is published in its entirety in the electronic edition of the PASP. A portion is shown here for guidance regarding its form
and content.

- Results of the analysis described in Section [12] for Photo-selected stars. Coefficients subscripted x are for the x-axis offsets, y are for the
y-axis. This analysis tests systematics in centroiding as a function of magnitude.



Photometry Comparison 






TABLE 12

r-band Centroiding RMS_x (in Pixels) for Photo-Selected Stars; Algorithm vs. Algorithm

                   DAOPhot   DoPhot   Photo   SExtractor 2.4.4
DAOPhot            ...       0.008    0.029   0.007
DoPhot             0.011     ...      0.024   0.004
Photo              0.030     0.021    ...     0.024
SExtractor 2.4.4   0.011     0.007    0.021   ...


Note. - Width of the stellar positional offset distribution evaluated 1 magnitude below the brightest object. The upper triangular matrix 
reflects r-band measurements of Photo-selected stars in run 3437, and the lower triangular for run 4207. 



TABLE 13

Comparison of Photometric Depth

Run    Filter   Magnitude   DAOPhot   DoPhot   Photo   SExtractor   Photo*
3437   u        Mmax        20.36     20.91    22.00   19.99        19.99
                M95         21.63     21.81    22.54   21.98        22.16
                M99         21.99     21.99    23.45   23.25        23.07
       g        Mmax        20.68     22.34    22.55   19.44        20.27
                M95         21.71     22.75    22.96   21.10        22.75

Note. - Table [13] is published in its entirety in the electronic edition of the PASP. A portion is shown here for guidance regarding its form
and content.

- Comparison of the photometric depths of each algorithm. We use three numbers to characterize this quantity. Mmax represents the
maximum of the measured star count histogram; M95 is the bin below which 95% of the stars are contained; M99 is the bin below which 99%
of the stars are. PSF magnitudes are used to compare DAOPhot, DoPhot, and Photo. For SExtractor, we use Photo's aperture magnitudes,
listed as Photo*, for comparison.



TABLE 14

Width of ΔM Distribution For Algorithm-Selected Stars in M2

Algorithm   Magnitude   u       g       r       i       z
daophot     Aperture    0.015   0.013   0.036   0.029   0.016
            PSF         0.024   0.032   0.018   0.015   0.016
allframe    Aperture    0.026   0.012   0.036   0.031   0.017
            PSF         0.018   0.020   0.014   0.011   0.011
allframe    Aperture    0.039   0.024   0.046   0.045   0.023
            PSF         0.020   0.028   0.018   0.014   0.012

Note. - We repeat the analyses summarized in Section [10] for globular cluster M2. We restrict our analyses to the algorithms daophot
and allframe. The first set of allframe results corresponds to objects classified by daophot as stars. The second set corresponds to objects
classified by allframe as stars.



TABLE 15

Width of ΔM Distribution in M2 as a Function of Crowding

Filter   Intercept   Slope
u        0.020       0.121
g        0.018       0.134
r        0.008       0.103
i        0.008       0.077
z        0.007       0.050



Note. - We repeat the analyses summarized in Section [10] for globular cluster M2, this time plotting ΔM as a function of crowding
conditions in the image. The r-band data are plotted in Figure [8].

TABLE 16

Comparison of Photometric Depth in M2

Run    Filter   Magnitude   daophot   allframe
3437   u        Mmax        20.25     20.65
                M95         21.84     22.44
                M99         22.24     22.84
       g        Mmax        20.80     20.80
                M95         22.14     22.91

Note. - Table [16] is published in its entirety in the electronic edition of the PASP. A portion is shown here for guidance regarding its form
and content.

- Comparison of the photometric depths of each algorithm. We use three numbers to characterize this quantity. Mmax represents the
maximum of the measured star count histogram; M95 is the bin below which 95% of the stars are contained; M99 is the bin below which
99% of the stars are. PSF magnitudes are used in this comparison.



TABLE 17

Algorithm Processing Time as a Function of The Number of Sources

Algorithm           SDSS Run   Slope        y-Intercept
                               (sec/#Det)   (sec)
DAOPhot             3437       0.260        10
                    4207       0.090        170
DoPhot              3437       0.025        101
                    4207       ...          ...
SExtractor v2.3.2   3437       0.010        4.2
                    4207       0.001        4.3
SExtractor v2.4.4   3437       0.001        4.5
                    4207       0.001        4.7


Note. - Scaling of processing time with the number of sources in the images. We determine the time it takes each algorithm to process 
one image versus the number of sources detected in that image. We find the linear trend with source number, listing here the slope and 
intercept. 



TABLE 18

SExtractor Profiling

Image           NObj   Size      BITPIX   VmSize kB     VmRSS kB      Time (s)   RMS (s)
r-003437.0170   750    2k x 2k    16      26134 (3.1)   16552 (2.0)   0.95       0.01
                                  32      25943 (1.5)   16435 (1.0)   0.98       0.02
                                 -32      26015 (1.5)   16451 (1.0)   0.97       0.01
                                 -64      26107 (0.8)   16488 (0.5)   1.09       0.02
                1619   2k x 4k    16      26724 (1.6)   16698 (1.0)   1.92       0.02


Note. - Table [18] is published in its entirety in the electronic edition of the PASP. A portion is shown here for guidance regarding its form
and content.

- Average memory usage and processing time of SExtractor as a function of image size, bit depth, and number of sources. VmSize and
VmRSS show the average maximum memory used; in parentheses is this number as a fraction of the image size. The average total processing
time and its RMS are also listed.

17. APPENDIX
17.1. The SDSS Photometric Pipeline: Photo

The SDSS photometric pipeline Photo contains a complete suite of data reduction tools that take the raw data
stream, apply reduction and calibration stages, and extract photometry from the calibrated images. Because the
images we are using have been pre-processed by Photo, we expect that Photo has a distinct advantage in the quality
of its photometric measurements.

In Photo, the data stream from each CCD (drift-scanning results in an "infinitely" long narrow image) is divided into
an overlapping series of 10' by 13' frames for ease of processing. A Photo module named frames processes each of these
separately. However, in order to ensure continuity along the data stream, certain quantities need to be determined on
timescales up to the length of the imaging run. The astrometric and photometric calibrations certainly fall into that
category; in addition, a Photo module named the postage-stamp pipeline (PSP) calculates a global sky for a field,
flat-field vector, bias level, and the PSF. Once these are provided, a frames run can be trivially parallelized.

17.1.1. The Point Spread Function in Photo 

Even in the absence of atmospheric inhomogeneities the SDSS telescope delivers images whose FWHMs vary by up
to 15% from one side of a CCD to the other; the worst effects are seen in the chips furthest from the optical axis. Since
the atmospheric seeing is not constant in time, the delivered image quality is a complex two-dimensional function.
The description of the PSF is critical for accurate PSF photometry, for star/galaxy separation and for studies that
measure the shapes of non-stellar objects.

The SDSS imaging point spread function (PSF) is modeled heuristically in each band using a Karhunen-Loeve
(K-L) transform. In particular, using stars brighter than roughly 20th magnitude, the stellar images from a series of
five frames are expanded into eigenimages and the first three terms are kept. The variation of the coefficients that
multiply these terms with position across the chip is described by a low-order polynomial.

The success of this K-L expansion depends critically on successful selection of PSF stars. In essence, to determine 
the PSF one needs to select stars that look like the PSF, a requirement that results in a somewhat convoluted selection 
procedure. 

The selection of PSF stars is done in two steps. In the first crude step stars that are grossly inadequate are 
rejected based on their individual properties (i.e. without considering the overall sample properties). This category 
includes objects that are too faint, those with saturated or cosmic ray pixels, objects with very close neighbors, and 
significantly elongated objects (star/galaxy information is not yet available at this processing stage). In the second 
step the distribution of image size and ellipticity is used to reject stars that significantly deviate (∼3σ or more) from 
the median. Typically about 50% of bright objects (r < 19) survive both rejection steps. 
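
The second rejection step can be illustrated with a minimal Python sketch. It assumes each candidate star is a dict with the hypothetical keys `size` and `ellipticity`, and it estimates the scatter from the median absolute deviation; the text does not say how Photo estimates the scatter, so that choice (like the function name) is an assumption of this sketch rather than a description of Photo's code.

```python
import statistics

def clip_psf_candidates(stars, nsigma=3.0):
    """Keep stars whose size and ellipticity lie within nsigma of the median.

    `stars` is a list of dicts with hypothetical keys 'size' and 'ellipticity'.
    The scatter is estimated from the median absolute deviation (MAD); this is
    an assumption of the sketch, not a statement about Photo's internals.
    """
    kept = list(stars)
    for key in ("size", "ellipticity"):
        values = [s[key] for s in kept]
        med = statistics.median(values)
        mad = statistics.median(abs(v - med) for v in values)
        sigma = 1.4826 * mad  # MAD-to-sigma factor for a Gaussian distribution
        if sigma == 0.0:
            continue  # degenerate scatter estimate: skip this cut
        kept = [s for s in kept if abs(s[key] - med) <= nsigma * sigma]
    return kept
```

The cuts are applied in sequence, so the ellipticity statistics are computed only from stars that survived the size cut.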

17.1.2. Object Detection and Measurement in Photo 

Objects in the frame are detected and their properties measured in a four-step process in each band. First, an 
object finder is run to detect bright objects. In each band, the object finder detects pixels that are more than 200σ 
(corresponding roughly to r = 17.5) above the sky noise; only a single pixel need be over this threshold for an object 
to be detected at this stage. These objects are flagged as BRIGHT. The extended power-law wings of BRIGHT 
objects that are saturated are subtracted from the frame. Such stars are marked SUBTRACTED. Then the sky level 
is estimated by median-smoothing the frame image on a scale of approximately 100"; the resulting "local" sky image 
is subtracted from the frame (a global sky determined on an entire frame has already been subtracted). 

Third, objects are found by smoothing the image with a Gaussian fit to the PSF and looking for 5σ peaks over the 
(smoothed) sky in each band. After objects are detected, they are "grown" more or less isotropically by an amount 
approximately equal to the radius of the seeing disk. An object is defined as a connected set of pixels that are detected 
in at least one band. All pixels in the object are subsequently used in the analysis in every band, whether or not they 
were originally detected in that band. Photo never reports an upper limit for the detection of an object but, rather, 
carries out a proper measurement, with its error, for each of the varieties of flux listed below. 

Objects detected in a given band at this stage are flagged by setting the mask bit BINNED1 in that band. All pixel 
values in these BINNED1 objects are then replaced by the background level (with sky noise added in), the frame is 
rebinned into a 2 x 2 pixel image, and the object finder is run again. The resulting sample is flagged in a similar 
way with the BINNED2 mask, and pixel values in these objects are replaced with the background level. Finally, the 
original pixel data is rebinned in a 4 x 4 pixel image, and objects found at this stage are flagged BINNED4. The set 
of detected objects then consists of all objects with pixels flagged BINNED1, BINNED2, or BINNED4. 
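
The rebinning used between detection passes can be sketched in a few lines of Python. The text does not state whether Photo sums or averages the pixels in each block; the sketch below sums them, and the function name `rebin` is hypothetical.

```python
def rebin(image, factor):
    """Sum non-overlapping factor x factor blocks of a 2-D list of numbers.

    Trailing rows or columns that do not fill a complete block are dropped.
    Summing (rather than averaging) is an assumption of this sketch.
    """
    ny = len(image) // factor
    nx = len(image[0]) // factor
    return [
        [
            sum(
                image[j * factor + dj][i * factor + di]
                for dj in range(factor)
                for di in range(factor)
            )
            for i in range(nx)
        ]
        for j in range(ny)
    ]
```

Calling `rebin(frame, 2)` and `rebin(frame, 4)` on the masked frame would give the images searched in the BINNED2 and BINNED4 passes, respectively.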

Fourth, the pipeline measures the properties of each object, including the position, as well as several measures of 
flux and shape, described more fully below. It attempts to determine whether each object actually consists of more 
than one object projected on the sky and, if so, to deblend such a "parent" object into its constituent "children", 
self-consistently across the bands (thus, all children have measurements in all bands). Then it again measures the 
properties of these individual children. Bright objects are measured twice: once with a global sky and no deblending 
run - this detection is flagged BRIGHT - and a second time with a local sky. For most purposes, only the latter is 
useful, and thus one should reject all objects flagged BRIGHT in compiling a sample of objects for study. 

17.1.3. Photometric Measurements in Photo 
There are several magnitude types provided by Photo, and all are measured for all the detected sources. 

PSF Magnitudes — For isolated stars, which are well described by the PSF, the optimal measure of the total flux is 
determined by fitting a PSF model to the object. In practice, this is done by sinc-shifting the image of a star so that 
it is exactly centered on a pixel and then fitting a Gaussian model of the PSF to it. This fit is carried out on the local 
PSF K-L model at each position as well; the difference between the two is then a local aperture correction, which 
gives a corrected PSF magnitude. Finally, bright stars are used to determine a further aperture correction to a radius 
of 7.4" as a function of seeing. This involved procedure is necessary to take into account the full variation of the PSF 
across the field, including the low signal-to-noise ratio wings. Empirically, this reduces the seeing dependence of the 
photometry to below 0.02 mag for seeing as poor as 2". 

The PSF magnitude errors include contributions from photon statistics and uncertainties in the PSF model and 
aperture correction. Repeat observations show that these errors are probably underestimated by 10%. 

Petrosian Magnitudes — For galaxy photometry, measuring flux is more difficult than for stars, because galaxies do 
not all have the same radial surface brightness profile, and they have no sharp edges. In order to avoid biases, one 
wishes to measure a constant fraction of the total light, independent of the position and distance of the object. To 
satisfy these requirements, the SDSS has adopted a modified form of the Petrosian (1976) system, measuring galaxy 
fluxes within a circular aperture whose radius is defined by the shape of the azimuthally averaged light profile (see 
Stoughton et al. 2002 for more details). 

Model Magnitudes — Just as the PSF magnitudes are optimal measures of the fluxes of stars, the optimal measure 
of the flux of a galaxy would use a matched galaxy model. With this in mind, the code fits two models to the two- 
dimensional image of each object in each band: a pure de Vaucouleurs profile, and a pure exponential profile. The 
models are convolved with a double-Gaussian fit to the PSF. Residuals between the double-Gaussian and the full K-L 
PSF model are added on for just the central PSF component of the image. 

In order to measure unbiased colors of galaxies, their flux is measured through equivalent apertures in all bands. The 
model (exponential or de Vaucouleurs) of higher likelihood in the r filter is applied (allowing only the amplitude to 
vary) in the other bands after convolving with the appropriate PSF in each band. The resulting magnitudes are called 
model magnitudes. The resulting estimate of galaxy color is unbiased in the absence of color gradients. Systematic 
differences from Petrosian colors are in fact often seen as a result of color gradients, in which case the concept of a 
global galaxy color is somewhat ambiguous. For faint galaxies, the model colors have appreciably higher signal-to-noise 
ratio than do the Petrosian colors. 

17.1.4. Star/Galaxy Separation in Photo 

A simple star-galaxy separator that works at the 95% confidence level to at least r = 21 is based on the difference 
between PSF and model magnitudes: "unresolved" objects are those with this difference smaller than 0.145 mag. This 
separation is done in each band separately, and again globally based on the summed fluxes from all bands in which 
the object is detected. 

Experimentation has shown that simple variants on this scheme, such as defining galaxies as those objects classified 
as such in any two of the three high signal-to-noise ratio bands (namely, g, r, and z), work better in some circumstances. 
However, this scheme occasionally fails to distinguish pairs of stars with separation small enough (< 2") that the 
deblender does not split them; it also occasionally classifies Seyfert galaxies with particularly bright nuclei as stars. 
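
The basic separator and the two-band voting variant can be written down directly. The following is a minimal Python sketch, not Photo's code: the function names and the `mags_by_band` dictionary layout are hypothetical, and only the 0.145 mag threshold and the "any two bands" rule come from the text.

```python
def is_unresolved(psf_mag, model_mag, threshold=0.145):
    """True if the PSF-minus-model magnitude difference is below threshold."""
    return (psf_mag - model_mag) < threshold

def is_galaxy_by_vote(mags_by_band, min_votes=2, threshold=0.145):
    """Classify as a galaxy if at least min_votes bands say 'resolved'.

    mags_by_band maps a band name to a (psf_mag, model_mag) tuple; the
    caller decides which bands to pass in.
    """
    votes = sum(
        not is_unresolved(psf, model, threshold)
        for psf, model in mags_by_band.values()
    )
    return votes >= min_votes
```

For example, an object with PSF and model magnitudes of 20.00 and 19.95 differs by 0.05 mag and would be called unresolved.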

17.1.5. Image Ellipticity 

While the model fits yield an estimate of the axis ratio and position angle of each object, it is useful to have model- 
independent measures of ellipticity. Two further measures of ellipticity are computed by frames, one based on second 
moments, the other based on the ellipticity of a particular isophote. The model fits do correctly account for the effect 
of the seeing, while these two methods do not. 

The first method measures flux-weighted second moments, defined in Stoughton et al. (2002). This method is not 
ideal at low signal-to-noise ratio. A second measure of ellipticity is given by measuring the ellipticity of the 25 mag 
per square arcsec isophote (in all bands). In detail, frames measures the radius of a particular isophote as a function 
of angle and Fourier-expands this function. It then extracts from the coefficients the centroid, major and minor axes, 
position angle, and average radius of the isophote in question. It also reports the derivative of each of these quantities 
with respect to isophote level, necessary to recompute these quantities if the photometric calibration changes. 

17.1.6. The Deblender 

Once objects are detected, they are deblended by identifying individual peaks within each object, merging the list 
of peaks across bands, and adaptively determining the profile of images associated with each peak, which sum to form 
the original image in each band. The originally detected object is referred to as the "parent" object and has the flag 
BLENDED set if multiple peaks are detected; the final set of subimages of which the parent consists are referred to 
as the "children" and have the flag CHILD set. All quantities are measured for both parent and child. For each 
child, parent gives the id of the parent (for parents themselves or isolated objects, this is set to the id of the BRIGHT 
counterpart if that exists; otherwise it is set to -1); for each parent, nchild gives the number of children an object 
has. Children are assigned the id numbers immediately after the id of the parent. Thus, if an object with id 23 is set 
as BLENDED and has nchild equal to 2, objects 24 and 25 will be set as CHILD and have parent equal to 23. 

The list of peaks in the parent is trimmed to combine peaks (from different bands) that are too close to each other 
(if this happens, the flag PEAKS_TOO_CLOSE is set in the parent). If there are more than 25 peaks, only the most 
significant are kept, and the flag DEBLEND_TOO_MANY_PEAKS is set in the parent. 

In a number of situations, the deblender decides not to process a BLENDED object; in this case the object is flagged 
as NODEBLEND. Most objects with EDGE set are not deblended. The exceptions are when the object is large enough 
(larger than roughly an arcminute) that it will most likely not be completely included in the adjacent scan line either; 
in this case, DEBLENDED_AT_EDGE is set, and the deblender gives it its best shot. When an object is larger than 
half a frame, the deblender also gives up, and the object is flagged as TOO_LARGE. Other intricacies of the deblending 
results are also recorded in flags (see Stoughton et al. 2003 for more details). 

On average, about 15%-20% of all detected objects are blended, and many of these are superpositions of galaxies 
that the deblender successfully treats by separating the images of the nearby objects. Thus, it is usually the childless 
(not BLENDED) objects that are of most interest for science applications. 



The SDSS astrometric pipeline, including treatment of chromatic aberration and improved centroiding, is described 
in detail by Pier et al. (2003). Of particular relevance here are centroiding corrections that are similar in spirit 
to aperture corrections for PSF magnitudes. A centroid correction (the difference in position estimate between an 
approximate quartic method and true centroid) is found using a high S/N PSF estimate, and then applied to low S/N 
objects. This correction may be as high as 1/4 of a pixel and is applied in situ. For this reason, it is expected that 
Photo's centroids will not perfectly agree with centroids determined by other algorithms. 



The DAOPhot package contains a set of photometry algorithms primarily designed to do stellar photometry and 
astrometry in crowded fields. The tools are included as either subroutines in the executable daophot or as independent 
executable programs. The programs are typically used in the following groupings: daophot^ and allstar; and 
daomatch, daomaster, montage2, and allframe. These programs are defined below. 

• daophot : Main executable program. Typically used to find stellar objects, perform aperture photometry, and 
derive a PSF for the image from a selected set of stars. The PSF-building task is the most complex, and is 
highly iterative. No accommodations are made for the measurement of extended sources. 

• allstar : Run in conjunction with daophot. Accepts the results of daophot's photometry and PSF-building 
stages and performs a multiple-profile PSF fit to stars in the image simultaneously, optimally deblending neigh- 
bors and merging detections if they are determined to be the same object. Allstar groups objects for a joint fit 
based upon their proximity, thus does not literally photometer the entire image at once. This program automati- 
cally undertakes an iterative process of merging stars in the input star list based upon a signal-to-noise criterion, 
rejecting bad objects, and re-fitting each group's centroids and brightnesses until all objects have converged (or 
a certain number of iterations are reached). In practice, this package is used to yield the "final" photometry and 
astrometry for a single image. 

• daomatch : If multiple images of a field have been acquired (either in different filters or on different dates), 
daomatch may be used to determine a basic geometric transformation (offset, scaling, and rotation) between the 



• daomaster : Takes the output of daomatch (an ensemble of geometric transformations, one for each science 
image) and performs a joint registration of the star lists, rejecting spurious matches and enforcing a common list 
of stars in all images for the match. The transformations may be of higher order (up to cubic) than in daomatch. 
Daomaster also returns the list of common stars that are present in a user-defined fraction of the images, up to 
a user-defined matching radius, as well as the geometric transformations derived from these stars. 

• montage2 : Takes the transformations from daomaster and makes a stacked image. The user decides which 
percentile from the ensemble of (sky-subtracted) input pixels yields the stacked image (i.e. 0.5 = median). The 
image weights scale as (Depth/FWHM)^. Pixels are resampled using nearest-neighbor interpolation, and the 
resulting images are not to be considered "science grade". This step is typically done after allframe is run, 
where it is used to coadd star-subtracted images to search for faint objects that were originally missed. 

• allframe : Takes the master star list and geometric transformations derived from daomaster and performs 
simultaneous PSF photometry on a given group of objects in the entire stack of images. This package is essentially 
a 3-dimensional version of allstar. Allframe mirrors in many aspects the envisioned LSST Image Processing 



The executable daophot is designed to be command-line driven, and in fact places the user in a small data processing 
environment. For this reason, it has proven difficult to turn this into an automated pipeline. In particular, the 
generation of the point-spread function in DAOPhot is a highly iterative process involving many stages. We have 
chosen to use Perl-language scripts to automate this process (Section 17.2.5). 

^ We use the following conventions: when referring to DAOPhot as a package we will capitalize the name; when referring to the executable 
daophot we will use lower case. 

17.2.1. How DAOPhot is Written 

The DAOPhot package is written in the language FORTRAN. It requires the cfitsio libraries, as well as various IRAF 
libraries. The code itself is very well documented, and in fact much of what we have learned about how it operates at 
the algorithmic level was derived straight from the FORTRAN code. 

However, the code also contains many hard-wired variables, and thus is not flexible enough for LSST's needs as 
implemented. Two prime examples that caused us difficulties are the maximum number of PSF stars allowed in the 
PSF model (MAXN), which had to be changed in two places in the file psf.f (one apparent, one not), and the 
maximum filename length allowed by DAOPhot (including the absolute path to the file), which was hard coded in 
enough places that it was infeasible to change them all. As a workaround, during actual DAOPhot reductions we made 
a copy of each image in the /tmp/ directory, operated on the file there, and then copied the derived data products 
back into the pipeline workspace. In addition, if one wants to change other variables such as the maximum image 
size or maximum number of images to reduce, one has to edit a file and then recompile the binaries. 

17.2.2. How DAOPhot is Designed to be Used 

DAOPhot is better described as a toolkit than a pipeline. In fact, it has been designed to be a user-interactive 
environment. This is particularly true for the generation of the PSF model, where the user is encouraged to manually 
review each star that has been input to the PSF generation section. While this toolkit comprises many tools, we only 
review the most relevant ones here. 

• SKY : DAOPhot uses the following algorithm to estimate the global sky value in the image for the purposes of 
object detection: 10000 pixels are chosen uniformly distributed across the image; the tails of this distribution 
are clipped; the mode is estimated as 3 × median − 2 × mean; and the RMS is derived from the 1σ width of 
the sky histogram about the mean. 

• FIND : Based upon the user-input readnoise and gain relevant for each image, and sky as derived above, 
DAOPhot will compute the random error per pixel. This value is normalized by the inverse square root of : [sum 
of the squares of the values] - [the square of the sum of the values] of a bivariate circular Gaussian function 
with unit height and the user-supplied value of the estimated FWHM. This yields the estimated random noise 
in the Gaussian-convolved background image. A user-defined multiple of this value is used as the star detection 
threshold. This represents the minimum central height above the local sky for an object to be considered 
significant, not the integrated signal from the entire detection. 

• PHOT : This subroutine performs aperture photometry on a list of stars. In this process, all stars are subtracted 
from the image (using the current PSF model), and each star is individually added in turn to the image to 
estimate its aperture flux. The user chooses apertures for measurement, as well as an inner and outer radius 
for local sky determination (determined in a manner similar to SKY). A circular aperture is approximated by 
an irregular polygon by only accepting fractions of the flux in each boundary pixel, with a linear fractional flux 
scaling between 1 and 0 for pixels within -0.5 and +0.5 pixels of the aperture radius, respectively. In addition, PHOT 
performs an azimuthal smoothing within each annulus bounded by neighboring apertures to recognize hot pixels: 
if a given pixel is discrepant relative to the mean and dispersion of other pixels within the same annulus, the 
discrepant pixel value is replaced by a weighted average of the pixel value and the mean value for the annulus. 
This is useful for "curve of growth" corrections but not directly relevant to our analysis here. If the photometry 
process fails (e.g. the modal sky could not be determined, or there is a bad pixel in the aperture) the magnitude 
error is set to 9.999. Uncertainties in the magnitudes for good objects contain terms from : random noise inside 
the star aperture, including readout noise and contamination by other stars in the neighborhood, estimated by 
the scatter in the sky values (this term increases as the square root of the area of the aperture); the Poisson 
statistics of the observed star brightness; and the uncertainty of the mean sky brightness (which increases directly 
with the area of the aperture). 

• PICK : This subroutine chooses good candidates for PSF stars based upon their distance from the edge of the 
frame and local crowding conditions. In particular, stars near brighter stars or within a user-defined threshold 
distance are rejected. If at least 3 apertures are specified in the PHOT stage, PICK will use M2 − M3 as well as 
M1 − M2 to choose objects, under the assumption that M2 − M3 will be larger for extended objects than for 
stars. In principle we could use Photo-selected stars for this process, but have decided to allow DAOPhot to select 
them. 

• PSF : The use of this procedure is complex enough that we address it in detail in Section 17.2.6 and Section 17.2.7. 
In summary, this routine takes a list of objects (e.g. those selected by PICK) and builds a model of the point- 
spread function. 

• SUBSTAR : This subroutine accepts an input list of objects, scales and shifts the PSF according to each star's 
magnitude and centroid, and subtracts them from the image. This is useful when looking for faint neighbors 
which might contaminate the PSF determination (in this mode, one subtracts off the PSF stars and runs FIND) 
or when undertaking additional rounds of PSF fitting (where one subtracts off all faint neighbors and runs PSF). 
The pattern of residuals left by SUBSTAR is also a critical diagnostic for determining the quality of the PSF. 
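
Two of the steps listed above reduce to one-line formulas: the modal sky estimate used by SKY and the boundary-pixel weight used by PHOT. A minimal Python sketch follows. The function names are hypothetical, the clipping of the sky sample is left to the caller, and measuring the distance `r` from the aperture centre to the pixel centre is an assumption of the sketch.

```python
import statistics

def modal_sky(pixels):
    """Mode estimate 3 * median - 2 * mean of an already-clipped sky sample."""
    return 3.0 * statistics.median(pixels) - 2.0 * statistics.fmean(pixels)

def pixel_fraction(r, aperture_radius):
    """Fraction of a pixel's flux counted inside a circular aperture.

    r is the distance (in pixels) of the pixel from the aperture centre.
    The weight is 1 for r <= R - 0.5, 0 for r >= R + 0.5, and falls
    linearly in between.
    """
    return max(0.0, min(1.0, aperture_radius + 0.5 - r))
```

With a 5-pixel aperture, a pixel at r = 5.0 contributes half of its flux, while pixels at r = 4.5 and r = 5.5 contribute all and none of theirs.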
17.2.3. Star-Galaxy Separation 



As DAOPhot is explicitly designed to do stellar photometry, daophot does not have the ability to do high confidence 
star-galaxy discrimination. The safeguards that have been built in are primarily to discriminate against cosmic rays 
and instrumental artifacts, such as bad pixels and CCD bleed from saturated pixels. To reject detections around these 
features, daophot FIND calculates the following parameters per object : 

• Sharp : Ratio of : [the height of the best-fit delta function that fits the data] divided by [the height of the 
best-fit Gaussian function that fits the peak]. For cosmic rays, this should be larger than one. For bad negative-going 
pixels, this should be close to zero. This statistic is primarily designed to filter against cosmic rays and 
bad pixels. The default tolerance for a good object in DAOPhot is a value between 0.2 and 1.0. 

• Round : To calculate this value, the data are summed along each dimension, and then fit with one-dimensional 
Gaussian functions along both x and y. The round parameter is the ratio : [the difference between the heights of 
the Gaussians] divided by [the average of the heights of the Gaussians]. An object elongated in the x-direction 
will have round < 0; in the y-direction, round > 0. This is primarily designed to filter against charge-overflow 
features. The default tolerance for a good object in DAOPhot is a value between -1.0 and 1.0. Note that objects 
elongated at oblique angles will not be preferentially rejected, thus this is only marginally useful for star-galaxy 
separation. An additional roundness parameter is calculated that measures the four-fold symmetry of the 
detection as a safeguard against diffraction spikes. 
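
The default FIND tolerances quoted above amount to a simple acceptance window. The Python sketch below encodes only those ranges; the function name is hypothetical, and treating the limits as inclusive is an assumption, since the text does not say how values exactly on a boundary are handled.

```python
def passes_find_cuts(sharp, round_, sharp_range=(0.2, 1.0), round_range=(-1.0, 1.0)):
    """True if the FIND Sharp and Round values fall inside the default windows.

    Inclusive limits are an assumption of this sketch.
    """
    sharp_ok = sharp_range[0] <= sharp <= sharp_range[1]
    round_ok = round_range[0] <= round_ <= round_range[1]
    return sharp_ok and round_ok
```

A cosmic ray (Sharp larger than one) or a bad negative-going pixel (Sharp near zero) fails the first window, while a strongly elongated charge-overflow feature fails the second.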

Clearly, neither of these statistics is optimal for doing star-galaxy separation. However, allstar also calculates 
the following parameters per object, which we ingest into our database as PSFChiSq and OrigClass, respectively. 

• Chi : A weighted estimate of the standard deviation of the residuals from the PSF fit. This is derived from : [the 
ratio of the observed pixel-to-pixel mean absolute deviation from the profile fit] divided by [the value expected 
on the basis of the noise properties]. The denominator is derived from the input gain and readnoise, Poisson 
statistics, some fraction of the total measured flux (input parameter PERCENT ERROR, default 0.75%) to allow for 
flat-fielding errors, plus a user-supplied (input parameter PROFILE ERROR, default 5%) estimated error of the 
fourth derivative of the PSF at the peak of the profile to account for uncertainties in interpolation. 

• Sharp : A parameter with the same name but different interpretation from daophot's Sharp parameter. This 
Sharp is a goodness-of-fit statistic describing how much broader the actual profile of the object is compared to 
the profile of the PSF. Pixels within 6 half-widths of the PSF are included in calculation of the quantity : 



where δ is the residual of the brightness of each pixel from the PSF fit and σ is the anticipated standard error 
of the intensity of the pixel. Objects less extended than the PSF (such as cosmic rays) have Sharp smaller than 
1.0; objects more extended than the PSF (such as galaxies) have Sharp larger than 1.0. This Sharp parameter 
is an estimate of the intrinsic angular size of a given object, and should tend to the same mean value regardless 
of the seeing. 

We also emphasize that DAOPhot operates only with PSFs. Any galaxy it encounters (or any saturated star that 
has been interpolated by Photo and thus does not follow exactly the image's PSF) tends to get split up into multiple 
components. FIND will detect a peak at the galaxy centroid, and after subtraction of a PSF at this position, the 
remainder of the object flux is modeled as multiple additional stellar objects. For this reason, DAOPhot photometry 
for galaxies and the very brightest stars is not to be trusted. This also causes difficulties in the OPTICS clustering runs 
(Section 7), since a single galaxy may have multiple components from DAOPhot. 

We also suspect that this is one reason DAOPhot finds more objects than Photo : it splits up galaxies (or saturated 
stars) into multiple components, which then cluster with other DAOPhot-reduced runs or filters, but not with Photo. 



The process of object deblending is not strictly supported in DAOPhot, insomuch as the object detection (daophot) 
and PSF photometry (allstar) portions of the code are decoupled. What happens in practice to a blended pair is 
that the bright component is detected in FIND, photometered in allstar, subtracted from an image using SUBSTAR, 
and its blended neighbor revealed in a call to FIND on the star-subtracted image. This pair of detections is then sent 
along with the original science image to allstar, which then attempts to deblend them using the PSF. This process 
does not always succeed, and allstar is able to merge stars into a single detection if S/N criteria are not met. It 
cannot however add a component to the fit if it feels additional deblending is required. 

It is important to note that all objects are assumed to be stellar. This approach will fail in the general case where 
there are significant numbers of background galaxies in the field, but should succeed in the case of very crowded stellar 
fields, such as globular clusters. 

allstar checks objects for merger if they are separated by less than 1 FWHM of the PSF. Objects are considered merged 
if they are separated by less than 0.375 times the FWHM. For neighbors with separation between 0.375 and 1.0 times the 
FWHM, allstar will merge them into a single detection if the signal-to-noise of the object with the largest magnitude 
error is smaller than a given threshold. This threshold increases from 1.0 for iteration number 5 of allstar up to 2.0 for 
iteration 15 and beyond. An object is considered to have converged once it is determined to have a S/N > 2.0. 

The process of merging objects yields a composite centroid from the weighted means of the most recent centroid 
estimates of both stars, and a composite brightness from the sum of brightnesses of both elements. This object is then 
marked for analysis in the next iteration of allstar. 
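
The merging rules described above can be summarized in a short Python sketch. Several details are assumptions, not taken from the allstar source: the threshold is taken to grow linearly between iterations 5 and 15, no S/N-based merging is applied before iteration 5, and the composite centroid is weighted by flux. All function names are hypothetical.

```python
def merge_threshold(iteration):
    """S/N threshold for merging: 1.0 at iteration 5, 2.0 at iteration 15 and beyond.

    Linear growth in between, and no threshold (None) before iteration 5,
    are assumptions of this sketch.
    """
    if iteration < 5:
        return None
    return min(2.0, 1.0 + (iteration - 5) / 10.0)

def should_merge(separation, fwhm, snr_weakest, iteration):
    """Decide whether two neighbors are merged into one detection.

    snr_weakest is the S/N of the object with the largest magnitude error.
    """
    ratio = separation / fwhm
    if ratio < 0.375:
        return True  # critically blended: always merged
    if ratio < 1.0:
        threshold = merge_threshold(iteration)
        return threshold is not None and snr_weakest < threshold
    return False  # separated by 1 FWHM or more: not considered for merger

def merge(star_a, star_b):
    """Combine two (x, y, flux) tuples: flux-weighted centroid, summed flux."""
    xa, ya, fa = star_a
    xb, yb, fb = star_b
    total = fa + fb
    return ((xa * fa + xb * fb) / total, (ya * fa + yb * fb) / total, total)
```

In this sketch a pair separated by 0.8 FWHM at iteration 10 is merged if the weaker member has S/N below 1.5.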

The program allframe uses a similar set of criteria for deblending. In this case, objects are considered critically 
blended if they are within 0.375 times the FWHM of the best-sampled frame in which they both appear. 

17.2.5. How We Married DAOPhot to Perl 

Since the DAOPhot package is more of a toolkit than a pipeline, to make it into an automated pipeline we have 
chosen to use the Perl scripting language. These scripts were derived from the thesis work of Becker (2000), and were 
designed to perform automated crowded field photometry on Galactic bulge and LMC images taken on the CTIO 0.9m 
telescope. 

In Perl, daophot (and allstar) is opened as a filehandle to which commands may be written. This is accomplished 
in the following way: 

$daopid = open(DAOPHOT, "| daophot >> $out_file"); 

The filehandle DAOPHOT is written to using simple print commands, such as 

print DAOPHOT "$re_dao\n"; 

where the variable $re_dao contains the image readnoise. In this way, we are able to send commands to the program 
as if we were typing them on the command line. 

Through trial-and-error, we have determined the sequence of prompts requested by daophot and allstar for a 
given command sequence, as well as the diversity of variations allowed. Our Perl script is designed to itself recognize 
each possible fork (e.g. if a file exists, do you overwrite it?) and send daophot the appropriate commands. We are 
thus able to replicate an interactive session with our automated scripts. 
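
For readers more familiar with Python, the same pattern of piping a pre-computed command sequence into an interactive program can be sketched with the standard `subprocess` module. This is an illustrative analogue and not the pipeline's code (which is Perl): the function name `run_scripted_session` and its arguments are hypothetical, and, like the Perl pipe above, it only writes to the program and does not read its prompts back.

```python
import subprocess

def run_scripted_session(argv, commands, log_path):
    """Feed newline-terminated commands to a program's standard input.

    The program's standard output is appended to log_path, mirroring the
    '>> $out_file' redirection of the Perl pipe. Returns the exit status.
    """
    with open(log_path, "a") as log:
        proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=log, text=True)
        for command in commands:
            proc.stdin.write(f"{command}\n")
        proc.stdin.close()  # signal end of input so the program can exit
        return proc.wait()
```

Because the commands are written without waiting for prompts, the caller must know the prompt sequence in advance, which is why the sequence had to be established by trial and error.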

17.2.6. The Point Spread Function in DAOPhot 

DAOPhot is very flexible in how it handles its PSF, and we believe this flexibility is one of the main reasons that it 
performed so well in our precision tests. 

The DAOPhot PSF model is a combination of two components : an analytic approximation to the true PSF; and 
a pixel-wise look-up table containing the average deviations of the true PSF from the analytic model. There are 6 
analytic models for DAOPhot to use*: 

• A Gaussian function, having two free parameters: half-width at half-maximum in x and y. The Gaussian 
function may be elliptical, but the axes are aligned with the x and y directions in the image. This restriction 
allows for fast computation, since the two-dimensional integral of the bivariate Gaussian over the area of any 
given pixel may be evaluated as the product of two one-dimensional integrals. 

• A Moffat function, having three free parameters: half-width at half-maximum in x and y, and (effectively) a 
position angle for the major axis of the ellipse. Since it's necessary to compute the two-dimensional integral 
anyway, we may as well let the ellipse be inclined with respect to the cardinal directions. In case you don't know 
it, a Moffat function is 

$$ \propto \frac{1}{(1 + z^2)^{\beta}} $$ 

where $z^2$ is something like $x^2/\alpha_x^2 + y^2/\alpha_y^2 + \alpha_{xy}\,xy$ (Note: not $\ldots + xy/\alpha_{xy}$, so $\alpha_{xy}$ can be zero). In this case, 
$\beta = 1.5$. 

• A Moffat function, having the same three parameters free, but with $\beta = 2.5$. 

• A Lorentz function, having three free parameters: ditto. 

• A "Penny" function: the sum of a Gaussian and a Lorentz function, having four free parameters. (As always) 
half-width at half-maximum in x and y; the fractional amplitude of the Gaussian function at the peak of the 
stellar profile; and the position angle of the tilted elliptical Gaussian. The Lorentz function may be elongated, 
too, but its long axis is parallel to the x or y direction. 

• A "Penny" function with five free parameters. This time the Lorentz function may also be tilted, in a different 
direction from the Gaussian. 

* These descriptions are lifted verbatim from the DAOPhot manual. 

It is perhaps worth noting that the data are not fit to an actual analytic profile, but instead to the function as 
integrated over the area of each pixel. 
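The distinction between a point-evaluated and a pixel-integrated profile can be illustrated with a minimal sketch. It assumes the Moffat form quoted in the list above; the function names, the parameter values, and the sub-pixel averaging scheme are ours for illustration and are not taken from DAOPhot, which may integrate differently.

```python
import numpy as np

def moffat(x, y, ax, ay, axy=0.0, beta=1.5):
    """Moffat profile 1 / (1 + z^2)^beta with z^2 = x^2/ax^2 + y^2/ay^2 + axy*x*y."""
    z2 = (x / ax) ** 2 + (y / ay) ** 2 + axy * x * y
    return 1.0 / (1.0 + z2) ** beta

def pixel_integrated(profile, xc, yc, nsub=5, **params):
    """Approximate the integral of `profile` over the unit pixel centred on
    (xc, yc) by averaging over an nsub x nsub grid of sub-pixel samples."""
    offsets = (np.arange(nsub) + 0.5) / nsub - 0.5
    dx, dy = np.meshgrid(offsets, offsets)
    return float(np.mean(profile(xc + dx, yc + dy, **params)))

# Illustrative (hypothetical) scale parameters, in pixels.
point_value = moffat(0.0, 0.0, ax=1.2, ay=1.0)
pixel_value = pixel_integrated(moffat, 0.0, 0.0, ax=1.2, ay=1.0)
```

For a sharply peaked profile the pixel-integrated value at the central pixel is lower than the point value at the centroid, which is why fitting the integrated function matters most for narrow or poorly sampled PSFs.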

The look-up table is allowed to vary spatially in a constant, linear, or quadratic fashion. The table has a resolution 
of one half pixel, centered on the centroid of the stars. It is necessary to both cleanly subtract off all neighbors and 
accurately determine the centroids of the objects for this mechanism to work optimally. High order terms of the look-up 
table have zero volume, so that the volume of the PSF is constant across the image. 

DAOPhot has the option to automatically choose which analytic model best fits the data, using as a metric the RMS 
of the residuals as a fraction of the peak height of the analytic function. In practice, we allow DAOPhot to fit all 6 
models to the ensemble of data and select the best fit profile. This leads to significant computational overhead, and is 
one culprit for the slowness of DAOPhot relative to the other algorithms. 

After DAOPhot has chosen the best model, it displays the star-by-star RMS residuals, as well as indications that it 
thinks a particular star is saturated, too near to the edge of the image, or has an RMS larger than 3 times the average. 
It is this list of RMS residuals that we need to parse in Perl. We use this RMS distribution to reject stars that fit the 
PSF model poorly, and then re-send the list of acceptable stars to the PSF stage. 

17.2.7. DAOPhot in Practice 

In our typical runs, we start with a high-threshold FIND command to locate bright stars. We run PHOT on the objects 
and SELECT the 800 brightest and most isolated objects in the image to use as the inputs to the initial PSF generation 
stage. 

In this first stage, we fit a pure analytic model with no lookup table. The program selects the best of the 6 analytic 
models, and lists the resulting RMS values star by star. We parse this list in Perl and reject those candidates that have 
more than 2.7 times the median RMS. The list of good objects is sent to allstar to determine positions, brightnesses, 
and local sky values. 

At this point, we want to start building up the complexity of the PSF by adding a look-up table. We would 
ideally subtract off the PSF stars, run a FIND on the residual image to detect faint neighbors, subtract off only 
these objects using SUBSTAR, and re-run PSF on the now-isolated PSF stars. Blended neighbors have a relatively small 
effect on the analytic model, but can contaminate the look-up table significantly. 

However, because of the complexities of the SDSS PSF, we encountered problems with DAOPhot finding incorrect 
initial centroids of the stars (meaning the PSF model was not exactly and consistently centered on the objects). Since 
the PSF model is incomplete at this stage, and we were not yet using a look-up table, the residuals between the 
analytic model and the true PSF were being detected by DAOPhot as entirely new objects in FIND. Thus every bright 
star was split in twain: the original detection, and the residual of this detection from the initial PSF model. DAOPhot 
was not inclined to merge these detections into a single object, and we ultimately ended up with an incomplete PSF 
model and multiple detections per star. 

We decided that we needed to first build a more complete model of the PSF before doing neighbor detection. 
This would allow allstar to successfully centroid each object, and in turn allow PSF to build a more accurately centered 
model. Essentially, we had to build up a better approximation of the PSF so that we could generate a more accurate 
PSF downstream. This process of bootstrapping seemed to solve the problem, but also slowed down the processing 
significantly. It also required that we start the PSF modeling process with many objects (we chose 800) since we 
wanted to beat down the systematics in the initial look-up table due to un-subtracted neighbors. 

Therefore we first increase the complexity of the PSF to include a look-up table without spatial variation, and re-run 
PSF without neighbor subtraction at this point. Candidates with more than 1.8 times the median RMS are rejected. 
This culled list is re-sent to PSF. We iterate this procedure until the list converges or we reach 3 iterations, whichever 
comes first. In addition, we halt the sigma-clipping process if the number of PSF stars falls below 100. This culled 
list is then re-sent to allstar to yield an updated list of PSF stars. 
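The rejection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not our Perl implementation: `fit_psf` is a hypothetical callback standing in for one run of the PSF stage, and we assume that when a cut would leave fewer than 100 stars the clipping simply stops with the current list.

```python
import statistics

def cull_psf_stars(star_ids, fit_psf, factor, max_iter=3, min_stars=100):
    """Iteratively drop PSF candidates whose RMS exceeds factor x median RMS.

    fit_psf is a hypothetical callback standing in for one run of the PSF
    stage: it takes a list of star ids and returns {star_id: rms_residual}.
    """
    stars = list(star_ids)
    for _ in range(max_iter):
        rms = fit_psf(stars)
        limit = factor * statistics.median(rms.values())
        kept = [s for s in stars if rms[s] <= limit]
        if len(kept) < min_stars:
            # Assumption: halt the clipping and keep the current list.
            break
        if len(kept) == len(stars):
            # No star was rejected, so the list has converged.
            break
        stars = kept
    return stars
```

With this structure the successive stages differ only in the `factor` passed in (2.7, 1.8, and 1.5 in the text) and in the PSF complexity hidden behind the callback.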

We send this new list to daophot and again increase the complexity of the PSF look-up table to include linear 
variation across the image and repeat the above loop, rejecting objects with more than 1.5 times the median RMS, 
and sending the culled list to allstar. 

At this point, we run a FIND on the PSF-star-subtracted image to find blended neighbors. This list is appended to 
the PSF-star list, and the ensemble is sent to allstar for joint photometry. Allstar ideally deblends neighbors and 
merges spurious detections, yielding accurate centroids. We use SUBSTAR to remove only the neighbors from the image. 
Finally, the PSF is generated on the neighbor subtracted image, using quadratic spatial variation in the look-up table, 
and rejecting objects with more than 1.5 times the median RMS. This yields our final PSF model. 

We next detect all sources in the image by: calling FIND with the final FWHM as derived from the PSF; running 
allstar to photometer and subtract the objects; running FIND on the star-subtracted image to detect blended or dim 
objects; running allstar on the merged star list, yielding another star-subtracted image; and a final run of FIND and 
allstar to produce the final PSF photometry per image. This list is sent to PHOT to produce aperture photometry 
results for the entire list of objects. 

We decided to produce allframe results by hand for a subset of our data because this algorithm is the closest 
existing piece of software to the envisioned LSST Image Processing Pipeline and its aggregate analysis of all images 
of a given sky patch. We used the field of globular cluster M2 (NGC 7089) for this analysis. Photo frequently fails to 
reduce this field due to its extreme crowding conditions. Thus it presents an opportunity to explore the parameter 
space opened by DAOPhot and allframe. 

We ran the standard DAOPhot reductions of this field, and fed the derived star lists from all 5 passbands and both runs 
into daomatch. We used the 5-band image from run 3437 as the reference astrometric frame. We next ran daomaster, 
matching up all objects in a 1-pixel (in the reference image) radius with quadratic transformations. This matching 
radius was monotonically decreased to 0.1 pixels, yielding an initial star list of ~8000 matches. The derived star list 
and transformations were fed to allframe, which produces star-subtracted images for each input image. These images 
were co-added using montage2, yielding an image containing all objects not matched in the daomaster stage. We 
next ran FIND on this image, and then allstar using the point-spread function of the reference image (a reasonable 
approximation since we only want initial centroids, which will be recalculated in subsequent calls to allframe). This 
starlist was appended to the results of daomaster and the images were re-fed into allframe. We ran an additional 
FIND and allstar on the co-added residuals of this second allframe run. The final star list was derived from a third 
and final allframe run on the images. 

17.2.9. Processing Time 

We found the preceding protocols sufficient to produce good results from DAOPhot and allframe, but it is likely that 
not all of it was necessary. The amount of over-design in the construction of the PSF is large, and this overhead can 
almost certainly be reduced. We did not test this parameter space, instead choosing to exercise the algorithm with 
very conservative (and time-consuming) settings. 

We address several points that affect the run-time of DAOPhot: 

• PSF fits 6 models to the ensemble of data every time it is called (up to 10 times per image). This yields a 
factor of 60 in run-time compared to the generation of a single PSF. This could be sped up by choosing a single 
analytic model to use, one that most closely approximates the characteristics of your data. With the inclusion 
of a look-up table in the PSF, the overall differences when using the different analytic models should ideally be 
minimal (assuming you can build a high-fidelity look-up table). In practice, you want to capture 
as much of the PSF as possible in the analytic portion of the model. 

• We decided to use a large number of stars (800) to initially feed to PSF, assuming (rightly so) that many would 
be rejected in our sigma clipping iterations. This is 1 PSF star for every 100x100 pixel patch in the 2048x4083 
image, perhaps a factor of 10 larger than is needed. The final PSF model tends to be derived from 200-300 stars. 

• The executables daophot and allstar are run approximately 30 times in the normal mode where we generate 
the PSF and detect and photometer all objects in the image. Each of these calls loads the image from disk. 
Some processes write temporary files to disk. And for each call, the output stream is captured and parsed by 
the controlling Perl scripts. This is clearly inefficient at the system level. A tighter integration between the 
processing software and its various components (e.g. the individual executables daophot and allstar) and the 
controlling software (middleware) would yield a vast improvement in system load. 
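The numbers quoted in the first two points above follow from simple arithmetic, reproduced here as a short check. The variable names are ours; the values are those given in the text.

```python
import math

n_models = 6          # analytic models tried on every PSF call
max_psf_calls = 10    # upper bound on PSF calls per image
fits_per_image = n_models * max_psf_calls   # factor relative to a single PSF fit

width, height = 2048, 4083                  # image dimensions in pixels
n_psf_stars = 800                           # initial PSF candidates
pixels_per_star = width * height / n_psf_stars
patch_side = math.sqrt(pixels_per_star)     # side of the square patch per star
```

The patch side comes out at roughly 102 pixels, consistent with the "1 PSF star for every 100x100 pixel patch" quoted above.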

Overall, our automated implementation of DAOPhot is very inefficient but produces satisfactory results. Our pipeline 
would benefit greatly from tighter integration of the application and its controlling middleware. However, we feel that 
the most improvement to be gained is in the generation of the PSF. Had we known a priori the locations of PSF stars 
and fed them directly to the PSF generation stage, we could have sped up the processing dramatically. We recommend 
that LSST builds and then uses on a nightly basis a master list of PSF stars to assist in this computation. 

17.3. DoPhot 

The DoPhot package (Schechter et al. 1993) is designed to robustly produce a catalog of stellar positions, magnitudes, 
and relatively crude star/galaxy classifications for detections from astronomical images. Like SExtractor, DoPhot was 
designed to work on a large number of images quickly with little to no interaction with the user. According to 
Schechter et al. (1993), it was in fact optimized to handle large numbers of poorly sampled, low S/N images. The 
major caveat made by the authors is that DoPhot may not be the optimal program (sacrificing completeness and 
accuracy) for use on datasets that differ dramatically from the data it was originally designed to work on. 

The version of DoPhot tested here is not the original software implementation as designed by Schechter et al. (1993). 
The original FORTRAN source code was translated, using f2c, into C-language code by I. Bond of the MOA Microlensing 
Collaboration. Much of the elegance of the original source code was lost in translation, and the resulting 
code is extremely difficult to interpret. Many of the subsequent changes to DoPhot were done in order to be able to 
do photometry in difference imaging (forced photometry, photometry on images with zero background, etc.). Nevertheless, 
it has been extensively modified to operate robustly in the Photpipe environment. We emphasize that the 
original software should not be implicated for any shortcomings in the analyses presented here. 

Given the uniqueness of SDSS drift-scan data and the complexity of the PSF for these images, we set out to investigate 
the usefulness of DoPhot with respect to the other algorithms described in this section with little expectation that 
DoPhot would measure up. As demonstrated below, the numerous input parameters and complicated implementation 
of the source code have made a thorough investigation of DoPhot's capabilities nearly impossible in the time frame 
given for this study. We caution the reader that the results we quote in the following sections for DoPhot may not be 
representative of the full capabilities of DoPhot. 

17.3.1. An Overview 

To enable DoPhot to run within the Photpipe framework, the C code version we used has been wrapped in an 
extensive amount of Perl. For our study, several additional modifications to both the Perl code and C code were 
necessary to accommodate the SDSS images. In particular, we added the second moments (sigx, sigy, sigxy), the 
$\chi^2$ (chisqr), and PSF magnitudes and errors to the default DoPhot output parameters. 

17.3.2. Object Detection and Measurement 

DoPhot returns both aperture magnitudes (again, using the optimal Photo aperture of 37.17 pixels) and PSF mag- 
nitudes and respective uncertainties. The PSF is based on an analytic model, consisting of similar ellipses of the 
form 

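$$I(x, y) = I_0 \left(1 + z^2 + \tfrac{1}{2}\beta_4 (z^2)^2 + \tfrac{1}{6}\beta_6 (z^2)^3\right)^{-1} + I_s \qquad (3)$$

where 

$$z^2 = \left[\tfrac{1}{2}\left(\frac{x^2}{\sigma_x^2} + 2\sigma_{xy} x y + \frac{y^2}{\sigma_y^2}\right)\right] \qquad (4)$$

$$x = (x' - x_0); \quad y = (y' - y_0). \qquad (5)$$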

This function is not allowed to vary spatially, putting this software at an extreme disadvantage compared to Photo 
and DAOPhot. This is particularly true for SDSS data, since temporal PSF changes (and the PSF is always changing) 
in drift-scanned data translate into spatial PSF variation in the images. 
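For concreteness, the analytic model can be written as a short function. This is a sketch that assumes the similar-ellipse form of Schechter et al. (1993) as we read it, with a central intensity, a sky term, and shape parameters matching the (sigx, sigy, sigxy) outputs mentioned earlier; the function and argument names are ours and are not DoPhot identifiers.

```python
import math

def dophot_model(xp, yp, i0, x0, y0, sigx, sigy, sigxy,
                 beta4=1.0, beta6=1.0, sky=0.0):
    """Evaluate i0 / (1 + z2 + beta4*z2^2/2 + beta6*z2^3/6) + sky,
    with z2 = 0.5 * (x^2/sigx^2 + 2*sigxy*x*y + y^2/sigy^2)."""
    x = xp - x0
    y = yp - y0
    z2 = 0.5 * (x**2 / sigx**2 + 2.0 * sigxy * x * y + y**2 / sigy**2)
    denom = 1.0 + z2 + 0.5 * beta4 * z2**2 + (1.0 / 6.0) * beta6 * z2**3
    return i0 / denom + sky

# With beta4 = beta6 = 1 the denominator is the truncated series of exp(z2),
# so the profile stays close to a Gaussian near the core.
model_value = dophot_model(11.5, 20.0, 100.0, 10.0, 20.0, 1.5, 1.5, 0.0)
gauss_value = 100.0 * math.exp(-0.5)
```

Because none of the shape parameters depends on position in the image, a single set of values has to describe every star, which is the limitation discussed above.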

DoPhot uses the initial inputs (user defined) for the seeing, background sky and the instrument to identify objects. 
After this first pass through the data DoPhot improves its initial estimate of the shape of the object by fitting the 
model of a typical star to a number of subrasters centered on a variety of detected objects. It does this until it finds 
the optimal model (star, galaxy, double star, cosmic ray) for each object (as described below). In much the same 
fashion as DAOPhot, the detected objects are subtracted from the image and another detection pass is performed and 
the object classification routine is rerun to improve the model. 

DoPhot produces a noise image which weights each pixel in its non-linear least squares fitting routine. This is also 
used to determine if the detection is sufficiently above the background or should be rejected (Korhonen et al. 2005). 

17.3.3. Star/Galaxy Separation 

DoPhot makes a crude attempt at separating a potential star from a double star or galaxy by comparing the shape 
parameters of the object to the given initial guesses for a "typical" stellar shape in the parameters file. If these shapes 
differ significantly and are larger than the specified footprint, DoPhot attempts to fit two typical stellar profiles to the 
object. If this too fails to meet a user specified threshold, the object is then classified as a galaxy. Discrimination 
between galaxies and double stars can be adjusted with the STARGALKNOB parameter. 

DoPhot returns one of nine different object types: 1 = star, 2 = galaxy, 3 = double star, and 4-9 flag the object for 
a variety of potential issues with the object and/or image that prevent a definitive classification. 

17.3.4. Crowded Field Photometry Comparison 

DoPhot does a relatively good job on crowded fields. DoPhot does better than SExtractor under most circumstances 
but worse than DAOPhot. According to the accompanying manual, tweaking the STARGALKNOB parameter will allow 
DoPhot to do better at discriminating double stars from galaxies at low galactic latitudes. Ferrarese et al. (2000) 
discuss the effect of using DoPhot on cosmic ray-cleaned images and crowded fields. They report that DoPhot has the 
tendency to overestimate the sky brightness significantly when cosmic rays are present. We used fully reduced and 
cosmic ray-cleaned SDSS images for our tests and were not sensitive to this effect. 

As in all packages, around bright stars residuals from the PSF subtraction may trigger the false detection of new 
objects on the residual flux. To compensate for this, DoPhot adds noise to the noise image it produces every time it 
subtracts a new detection from the image. However, this reduces the efficiency with which DoPhot can detect faint 
sources near bright objects. 

17.4. SExtractor 

The SExtractor package^ is designed to quickly produce reliable aperture photometry catalogs on a large number 
of detected sources from astronomical images. Aside from the ease of installation, SExtractor is also notable for its 
speed and versatility. Aside from Photo, it is one of the few packages that promises to distinguish and photometer 
both stars and galaxies. 

^ http://terapix.iap.fr/soft/sextractor/index.html 

17.4.1. An Overview of the Software 

SExtractor uses autoconf to configure the software to the particular system it is being installed on, making it 
extremely portable and flexible. It comes with an ensemble of runtime configuration files, including a list of default 
input and output parameters, neural network weight files for star-galaxy separation, and convolution masks to assist in 
object detection. SExtractor is but one part of a larger data processing environment that also includes EyE (Enhance 
Your Extraction^), which allows you to generate non-linear filters that may be used for adaptive filtering and feature 
detection in SExtractor. 

SExtractor itself uses a custom FITS interface derived from the Leiden Data Analysis Center (LDAC) toolset, and 
the WCSLIB^^ library to perform pixel-to-sky transformations. 

17.4.2. Object Detection 

One of the most difficult issues in photometry is the accurate determination of the sky background. In SExtractor, 
the background is determined locally in each mesh of a user specified grid that covers the image. Sigma clipping of 
pixels occurs until convergence at ±3σ about the median. If the sky estimate has changed less than 20% from the 
initial estimate, the mean of this clipped histogram is considered the sky. Otherwise the sky is estimated as the mode, 
defined as 2.5 × median − 1.5 × mean. Note that this is different from DAOPhot's definition of mode. These values are median 
filtered to avoid the influence of individual bright stars, and the global background model is derived from a bicubic 
spline fit to the mesh values. 
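The per-mesh estimate can be sketched as below. This is an illustration and not SExtractor's code: the function name is ours, and we apply the 20% test to the standard deviation of the clipped histogram, which is how we read the SExtractor documentation; the text above phrases the test in terms of the sky estimate, so that choice should be treated as an assumption.

```python
import numpy as np

def mesh_background(pixels, nsigma=3.0, max_iter=100):
    """Sigma-clip about the median, then return the clipped mean for
    uncrowded meshes or the mode estimate 2.5*median - 1.5*mean otherwise."""
    data = np.asarray(pixels, dtype=float).ravel()
    sigma0 = data.std()                    # dispersion before any clipping
    for _ in range(max_iter):
        med, sig = np.median(data), data.std()
        keep = np.abs(data - med) <= nsigma * sig
        if keep.all():                     # nothing left to clip: converged
            break
        data = data[keep]
    mean, med, sig = data.mean(), np.median(data), data.std()
    # Assumption: "changed by less than 20%" refers to the dispersion.
    if sigma0 == 0.0 or abs(sig - sigma0) / sigma0 < 0.2:
        return mean
    return 2.5 * med - 1.5 * mean
```

The median filtering of neighboring mesh values and the bicubic spline interpolation that follow are not shown.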

The background subtracted image is convolved with a filter optimized to detect the objects of interest in the image. 
This correctly suggests that choice of filter is essential. For example, the optimal filter to detect stars is the PSF flipped 
about the x and y axes. This occurs in practice by approximating this function with a symmetric Gaussian whose 
full-width at half-maximum is similar to the PSF FWHM. However, this filter is not optimal for galaxy detection, 
since galaxies are generally broader than the PSF, and oriented arbitrarily. In crowded fields, this convolution process 
tends to blend neighboring objects together, and without a PSF model makes it difficult to "segment" or "deblend" 
neighboring objects. To assist in this problem, SExtractor provides filters to use under varying seeing conditions and 
optimized to detect Gaussian functions (stars), extended low surface brightness objects, or wavelet features designed 
for crowded field detection. Ideally, one should develop filters with EyE optimized for the features one wants to detect, 
and apply these filters in SExtractor's filtering steps. 
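A minimal sketch of the Gaussian detection filter is given below. The kernel size, the zero-padded edges, and the function names are illustrative choices of ours and do not reproduce the convolution masks distributed with SExtractor.

```python
import numpy as np

def gaussian_kernel(fwhm, size=None):
    """Symmetric, unit-sum Gaussian kernel with the requested FWHM (pixels)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    if size is None:
        size = 2 * int(np.ceil(3 * sigma)) + 1     # cover about +/- 3 sigma
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()

def filter_image(image, kernel):
    """Correlate the image with the kernel, padding the edges with zeros."""
    img = np.asarray(image, dtype=float)
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0],
                                           dx:dx + img.shape[1]]
    return out
```

Because the kernel is symmetric, correlation and convolution coincide here; for a real, asymmetric PSF the flip about the x and y axes mentioned above would matter.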

17.4.3. Deblending 

SExtractor groups significant neighboring sets of pixels in the filtered image into "segments", allowing connectivity 
at the sides or corners. The user sets the threshold above which pixels are considered significant with parameter 
DETECT_THRESH. Segments must have at least DETECT_MINAREA pixels above this threshold to be considered significant. 
SExtractor attempts to deblend each segment by building a model of how the segment bifurcates into different objects 
as the detection threshold is diminished. The decision to regard a branch as distinct is based upon its relative integrated 
intensity. If the integrated pixel intensity of the branch is greater than a certain fraction of the composite object, it is 
considered distinct. The default parameters allow a contrast of approximately 6 magnitudes in blended objects. 
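The quoted contrast of approximately 6 magnitudes can be related to the minimum flux fraction by a one-line conversion. The value 0.005 used below is, to our knowledge, the DEBLEND_MINCONT setting in the default configuration file distributed with SExtractor; it should be treated as an assumption and checked against the configuration actually in use.

```python
import math

def contrast_in_magnitudes(min_fraction):
    """Magnitude difference corresponding to a minimum flux fraction."""
    return -2.5 * math.log10(min_fraction)

default_mincont = 0.005   # assumed default DEBLEND_MINCONT
delta_mag = contrast_in_magnitudes(default_mincont)
```

Under that assumption the contrast evaluates to about 5.75 magnitudes, which rounds to the figure given in the text.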

17.4.4. Object Measurement 

After detection and deblending, SExtractor characterizes each source. Only pixels above the detection threshold 
are considered. In general, the user requests a subset of desired characteristics from the longer list of parameters 
SExtractor is able to measure. However, some of the isophotal measurements are required by SExtractor, and are 
performed even if not requested by the user. 

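As an example, the isophotal 2nd-order moments are calculated from the image as follows: 

$$\langle x^2 \rangle = \frac{\sum_{i \in S} I_i x_i^2}{\sum_{i \in S} I_i} - \langle x \rangle^2$$

$$\langle xy \rangle = \frac{\sum_{i \in S} I_i x_i y_i}{\sum_{i \in S} I_i} - \langle x \rangle \langle y \rangle$$

where the sums run over the pixels $i$ of the segment $S$, with intensities $I_i$. 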

However, isophotal measurements are not optimal, in that they are sensitive to the thresholding level. In SExtractor 
versions later than 2.4, "windowed" measurements of positions and shapes are allowed. These include a Gaussian 
weighting, similar to the adaptive second moments used by Photo. While more robust than isophotal measurements, 
they are derived iteratively, and thus more computationally expensive. 
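The isophotal moments can be computed directly from the pixels above the threshold, as in the sketch below. It treats all pixels above the threshold as a single object (no segmentation or deblending), and the function name is ours.

```python
import numpy as np

def isophotal_moments(image, threshold):
    """Intensity-weighted centroid and second-order moments of the pixels
    above `threshold`, treated as a single segment."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.nonzero(img > threshold)
    w = img[ys, xs]
    total = w.sum()
    xbar = (w * xs).sum() / total
    ybar = (w * ys).sum() / total
    x2 = (w * xs**2).sum() / total - xbar**2
    y2 = (w * ys**2).sum() / total - ybar**2
    xy = (w * xs * ys).sum() / total - xbar * ybar
    return xbar, ybar, x2, y2, xy

# A small symmetric test object: a bright pixel with four fainter neighbors.
stamp = np.zeros((5, 5))
stamp[2, 2] = 4.0
stamp[2, 1] = stamp[2, 3] = stamp[1, 2] = stamp[3, 2] = 1.0
moments = isophotal_moments(stamp, threshold=0.5)
```

Raising the threshold above 1 in this example removes the four neighbors and collapses the second moments to zero, which illustrates the sensitivity to the thresholding level noted above.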

SExtractor is capable of determining magnitudes in five different ways. Each of these parameters is discussed in 
detail in the users guides available on the TERAPIX site given above. We have distilled the information on these and 
other main features of this package here for completeness but refer the reader to the manuals for further details. 

^ http://terapix.iap.fr/soft/eye/index.html 

^^ http://www.atnf.csiro.au/people/mcalabre/WCS/ 

• MAG_ISO: isophotal magnitudes - SExtractor uses a user defined threshold for detection as the lowest isophote 
(pixels above the threshold minus the background). This uses the DETECT_THRESH parameter in the setup file. 

• MAG_ISOCOR: corrected isophotal magnitudes - retrieves the amount of flux in the wings of the isophotal (Gaussian) 
area. 

• MAG_AUTO: automatic aperture magnitudes - from Kron-like elliptical apertures. 

• MAG_BEST: Choice between ISOCOR and AUTO - typically AUTO unless nearest neighbors influence the photometry 
by more than 10%. 

• MAG_APER: fixed-aperture magnitudes - user defined circular apertures. 

• MAG_PETRO: Petrosian aperture - similar to AUTO's Kron-like aperture (as of version 2.4.4) with different radius 
but similar position angle and ellipticity. 

17.4.5. Star-galaxy Classification 

SExtractor uses a neural-network-based star/galaxy classifier which allows it to do a primitive classification of 
objects (returned as CLASS_STAR). This classifier may be augmented by using the EyE package^^ to design more 
complex classifiers. 

The object classification in SExtractor is designed to detect and classify both galaxies and stars using a neural 
network output. SExtractor begins its object classification with the pixel scale of the input image and a user supplied 
estimate of the seeing FWHM. The neural network uses these values to make an initial rough guess about object shape 
and size on the image. The final classification for an object is designated by the CLASS_STAR parameter and has a 
fractional value between 0 and 1. SExtractor considers a zero to be a galaxy and a one to be a star. In Section [9] 
we show exactly how easily the values between 0 and 1 can be reliably interpreted as either a galaxy or a star using 
Photo's galaxy/star classifications as "truth" for each object and comparing the results. 

Parameters for the detection and analysis thresholds (DETECT_THRESH, ANALYSIS_THRESH) and deblending 
(DEBLEND_MINCONT, DEBLEND_NTHRESH) can be set to improve the detection rate and quality. Note however that 
much like DAOPhot, if given too fine a deblending SExtractor may deblend large galaxies into several individual 
objects. 

CLASS_STAR behaves as a sharply-tuned Bayesian classifier. Results can become unreliable when the actual PSF 
shape is different from what it was trained with (Moffat-like), or when the user-provided SEEING_FWHM is inaccurate. 
Asymmetric PSFs and strong variations in the PSF across the field are additional factors that limit the accuracy of 
the classifier. These effects are frequently seen in large-area CCD mosaics. Because of these shortcomings, using 
CLASS_STAR for star/galaxy separation is generally not recommended in large surveys. A preferred method is to use 
FLUX_RADIUS (the radius of the disk which contains half of the flux) as well as its variation across the image. 

17.4.6. Using a PSF Model 

Because of SExtractor's robust deblender, it does a reasonable job at performing photometry in crowded fields. 
The software will process the images to completion, although the output catalog should be closely inspected to verify 
the level of deblending was appropriate. It is more robust than Photo in this regard, as Photo is known to fail at the 
deblending stages in the most crowded of fields. However, the photometric accuracy of SExtractor in crowded fields, 
and for faint sources, has generally been limited by the lack of a PSF model. 

Contrary to most literature sources, SExtractor can perform PSF photometry and position measurements (see 
Kalirai et al. 2001a,b; Bertin 2001, for examples). The PSFEx^^ package provides this functionality. This is accomplished 
in three steps: (a) make an initial pass through SExtractor, and create a binary catalog containing small 
images around each bright source; (b) pass this catalog through PSFEx to create a model describing the PSF and its 
variations; (c) rerun SExtractor requesting parameters such as MAG_PSF, MAGERR_PSF, etc. At this stage, there are 
still completeness issues in very crowded fields, which has prevented the public release of the PSFEx package. 

17.4.7. SExtractor In Practice 

Unlike DoPhot and DAOPhot, SExtractor is relatively straightforward to use within the framework of the Photpipe 
pipeline, requiring little initial setup and no modifications to the source code. 

The parameters we used in our test runs with SExtractor from the setup file (default.sex) and the requested output 
catalog parameters (default_sex.params) can be found in the Appendix. In particular, the parameter NUMBER is a 
running number used for cross identification and is not recorded in the database. X_IMAGE, X2_IMAGE, Y_IMAGE, Y2_IMAGE, 
and XY_IMAGE have been deprecated in the new version in favor of the new Gaussian-windowed measurements. As is 
demonstrated in the photometric analysis, the windowed measures are vastly superior to the old parameters, which 
were essentially isophotal quantities. For completeness, we requested the MAG_APER and MAGERR_APER values in a 
37.171 pixel diameter aperture (a radius of 7.36 arcsec at 0.396 arcsec/pixel), which is the aperture we chose to use 
for Photo's aperture photometry. 

"Enhance Your Extraction", http://terapix.iap.fr/soft/eye 

While PSFEx has not been officially released, the software may be downloaded from the TERAPIX public repository at 
http://terapix.iap.fr/wsvn/index/public/software/psfex/ 

17.4.8. Crowded Field Photometry Comparison 

How well does SExtractor perform in crowded fields? Relatively well if deblend and threshold parameters are set 
at reasonable values for your images. The unavoidable end result is that SExtractor's neural network breaks down 
at the low magnitude end, especially when it comes to detecting faint galaxies in crowded fields. Holwerda (2005, and 
references therein) suggests two novel approaches to detecting these faint galaxies using SExtractor. 

The first involves the use of DAOPhot to first subtract all objects DAOPhot detects as stars in the crowded field and 
save the subtracted image. DAOPhot is essentially optimized for such a task. Without the influence of the additional 
stars in the image, SExtractor does a better job at finding faint galaxies, although we do not explore this claim in our 
report. The second involves the use of two (or more) color images. Gonzalez et al. (1998) use B − I images to detect 
sources instead of using the single color images. The major disadvantage of this is the increase in noise associated 
with the image, which will in turn produce more spurious SExtractor detections. 

17.5. Previous Tests Involving DAOPhot, DoPhot, and SExtractor 

There are a few noteworthy studies in the literature that investigate the usefulness of the algorithms in this study. 
Most of the algorithm comparisons found in the literature and on the web (e.g. Alard 2000; Ferrarese et al. 2000; 
Neill 2005; Staude & Schwope 2001; Korhonen et al. 2005; Smolcic et al. 2006) use SExtractor version 2.3.2, DoPhot 
version 2.0, and/or DAOPhot version II in their analysis. The latest versions used in this analysis of SExtractor (version 
2.4.4), DoPhot (version 3.0) and DAOPhot (version IV) include significant upgrades and enhancements over their older, 
well used, and well studied predecessors. 

17.5.1. Smolcic et al. (2006): Assessment of DoPhot for Crowded Field Photometry 

This study implements a new pipeline designed around a version of DoPhot v2.0 that was wrapped in C and 
compiled under f2c by E. Magnier. They use this pipeline on crowded fields where Photo gives poor results. Instead of 
determining the repeatability of their photometric measurements or comparing their photometry to another algorithm 
as we have done with Photo, the authors use DoPhot's PSF model to generate synthetic stars and place them on an 
image through Monte Carlo simulations. They created both sparse and crowded fields and quantify their completeness 
at different magnitudes as the ratio of the number of artificial stars extracted by DoPhot to the number of artificial 
stars on the frame, $n_{output}/n_{input}$. 
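The completeness statistic defined above can be evaluated per magnitude bin as in the following sketch. The function name, the input layout (one magnitude and one recovered flag per artificial star), and the example values are ours and are purely illustrative; they are not data from the study.

```python
import numpy as np

def completeness(input_mags, recovered, bin_edges):
    """Ratio n_output / n_input of recovered to injected artificial stars in
    each magnitude bin; bins with no injected stars are returned as NaN."""
    input_mags = np.asarray(input_mags, dtype=float)
    recovered = np.asarray(recovered, dtype=bool)
    n_in, _ = np.histogram(input_mags, bins=bin_edges)
    n_out, _ = np.histogram(input_mags[recovered], bins=bin_edges)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(n_in > 0, n_out / n_in, np.nan)

# Made-up example: six injected stars, four of them recovered.
example = completeness(
    [18.2, 18.7, 19.1, 19.5, 19.9, 20.3],
    [True, True, True, True, False, False],
    [18, 19, 20, 21],
)
```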

Their completeness for sparse fields is comparable to that of Photo at the bright end (~95%-99%) and falls below 
90% at magnitudes fainter than 20-21 (filter dependent). Photo is quoted as having 95% completeness for magnitudes 
between 21.3-22.2 (for g,r,i). For magnitudes brighter than 21 (g,r,i) our recovery of stars as compared to Photo is 
83%(i)-93%(5) for Run 3437 and 87%(z)-96%(g) for Run 4207 (refer to Table E]). 

For crowded fields Smolcic et al. (2006) find that in regions of high stellar density (center of Leo I) there is no 
appreciable effect on the number of synthetic stars recovered to a magnitude limit of ~20. At fainter magnitudes 
and stellar densities of ~200 stars/arcmin$^2$ their completeness suffers a 10%-30% decrease in the number of stars 
recovered by DoPhot. 

The success of the Smolcic et al. (2006) DoPhot pipeline in crowded fields is likely due to their attention to the 
background sky model. We used the simple uniform gradient model which is supposed to give a reasonable description 
of the background sky. The Smolcic et al. (2006) pipeline uses the modified Hubble profile model and estimates the 
seeing and background sky directly from each image. They claim this gives them a better detection rate in crowded 
fields by a factor of ~3. 

Smolcic et al. (2006) were most concerned with detecting sources in the crowded field SDSS images of the dwarf 
spheroidal galaxy Leo I and apparently were less concerned with detecting faint galaxy sources and the accuracy of 
their astrometry as they do not discuss any analysis or fine-tuning of their pipeline to accommodate these techniques. 

17.5.2. Ferrarese et al. (2000): Comparison of DoPhot and DAOPhot/allframe on Crowded Stellar Fields 

This study tests both DoPhot and allframe using artificial star simulations with a variety of complex backgrounds 
and stellar densities for crowded fields observed with HST/WFPC2. Their goal was to determine the distances to 
Cepheid variables and investigate the effect, if any, these two packages had on the distance determinations. The 
authors find that when using DoPhot it is crucial that the frames have cosmic rays removed, otherwise DoPhot tends to 
overestimate the sky brightness. allframe photometers all frames simultaneously, which allows it to easily flag and 
ignore cosmic rays. Our frames were cleaned of cosmic rays prior to using DoPhot, and are therefore not significantly 
affected by this bias. 

DoPhot photometry on their artificial frames was found to be more complete than allframe. DoPhot and allframe 
agree to within 0.05 magnitudes (within uncertainties for aperture corrections). In crowded field regions, confusion 
noise and a rapidly varying background contribution resulted in stars being measured consistently too bright, by 25% 
for DoPhot and ~ 5-10% for allframe. This affects the photometry for single-epoch observations significantly. For 
DoPhot the effect can be as little as 0.05 magnitudes in moderately crowded fields and as large as 0.2 magnitudes 
for the most crowded of their observed fields. Surprisingly, Ferrarese et al. (2000) find that this bias is worse when 
allframe photometry is used. 
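
Because the paragraph above quotes biases both in percent and in magnitudes, a small conversion helps when comparing them. The sketch below uses only the standard relation between a magnitude offset and a flux ratio; the function names are illustrative and are not taken from either package.

```python
import math

def flux_excess_from_mag_offset(delta_mag):
    """Fractional flux excess of a star measured delta_mag magnitudes too bright."""
    return 10.0 ** (0.4 * delta_mag) - 1.0

def mag_offset_from_flux_excess(fraction):
    """Magnitude offset corresponding to a fractional flux excess."""
    return 2.5 * math.log10(1.0 + fraction)

for delta_mag in (0.05, 0.2):
    excess = flux_excess_from_mag_offset(delta_mag)
    print(f"{delta_mag:.2f} mag too bright -> {100.0 * excess:.1f}% excess flux")
```

An offset of 0.05 magnitudes corresponds to roughly 5% in flux, and 0.2 magnitudes to roughly 20%.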

Their overall conclusion was that both packages are equally suited to determining the distances to Cepheid variables, 
with allframe underestimating the distances by 1% and DoPhot by only slightly more (2%). 
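
To put the 1% and 2% distance underestimates on the same scale as the photometric offsets, the distance modulus relation can be inverted. This is a back-of-the-envelope sketch that assumes the net effect acts as a single offset in distance modulus; an actual Cepheid distance determination combines several bands and corrections, so the function names and this simplification are ours, not those of Ferrarese et al. (2000).

```python
import math

def mag_offset_for_distance_underestimate(fraction):
    """Distance-modulus offset (mag) that shortens the inferred distance by `fraction`."""
    return -5.0 * math.log10(1.0 - fraction)

def distance_ratio_from_mag_offset(delta_mag):
    """Inferred/true distance when the distance modulus is delta_mag too small."""
    return 10.0 ** (-delta_mag / 5.0)

for fraction in (0.01, 0.02):
    offset = mag_offset_for_distance_underestimate(fraction)
    print(f"{100.0 * fraction:.0f}% distance underestimate -> {offset:.3f} mag")
```

Under this simplification, distance underestimates of 1% and 2% correspond to net distance-modulus offsets of about 0.02 and 0.04 magnitudes, respectively.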

17.5.3. Other DoPhot Studies 

Bellazzini et al. (2004) use a version of DoPhot modified by P. Montegriffo (Bologna Observatory) to read images in 
double precision format. Like Smolcic et al. (2006), they use images seeded with synthetic stars to confirm that their 
photometric uncertainties are small and that blended sources do not impact their analysis in any significant way. They 
report a completeness of over 80% over the range i i i magnitudes for their sample. 

A similar analysis is performed by Reid & Mould (1991) using DoPhot (see also Vogt et al. 1995; Gallart et al. 1999). 
They also perform a limited i-band comparison between DoPhot and DAOPhot where they find that DoPhot does a 
better job at estimating the sky background in the crowded field images. DoPhot systematically finds faint stars to 
be brighter in magnitude than DAOPhot, which they attribute to DoPhot determining the sky background from the fully 
subtracted frame, whereas DAOPhot computes the background before star subtraction, resulting in a difference of less 
than 1% in the computed sky backgrounds (DoPhot's is lower). 
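
A sub-percent difference in sky level matters for faint stars because the sky is subtracted over many pixels. The sketch below illustrates this with a simple aperture sum; both DoPhot and DAOPhot fit PSFs, so this is a deliberately simplified stand-in, and the counts, pixel numbers, and function name are hypothetical.

```python
import math

def mag_shift_from_sky_error(star_counts, sky_per_pixel, n_pixels, sky_fraction_low):
    """Magnitude shift of a star when the sky is underestimated by sky_fraction_low.

    The unsubtracted sky (sky_fraction_low * sky_per_pixel * n_pixels) is
    wrongly credited to the star, so the result is negative (brighter).
    """
    extra_counts = sky_fraction_low * sky_per_pixel * n_pixels
    return -2.5 * math.log10((star_counts + extra_counts) / star_counts)

# Hypothetical numbers: sky of 500 counts/pixel, 100-pixel aperture, sky 1% too low.
faint = mag_shift_from_sky_error(1.0e3, 500.0, 100, 0.01)
bright = mag_shift_from_sky_error(1.0e5, 500.0, 100, 0.01)
print(f"faint star shift:  {faint:.3f} mag")
print(f"bright star shift: {bright:.3f} mag")
```

With these made-up numbers the same 1% sky difference shifts the faint star by several tenths of a magnitude but the bright star by only a few millimagnitudes, which matches the sense of the reported effect: the lower sky estimate makes faint stars appear brighter.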



